Block-based General-to-Specific (GETS) modelling
Auxiliary function (i.e. not intended for the average user) that enables block-based GETS modelling with a user-specified estimator, diagnostics and goodness-of-fit criterion.
Usage
blocksFun(y, x, untransformed.residuals=NULL, blocks=NULL,
no.of.blocks=NULL, max.block.size=30, ratio.threshold=0.8,
gets.of.union=TRUE, force.invertibility=FALSE,
user.estimator=list(name="ols"), t.pval=0.001, wald.pval=t.pval,
do.pet=FALSE, ar.LjungB=NULL, arch.LjungB=NULL, normality.JarqueB=NULL,
user.diagnostics=NULL, gof.function=list(name="infocrit"),
gof.method=c("min", "max"), keep=NULL, include.gum=FALSE,
include.1cut=FALSE, include.empty=FALSE, max.paths=NULL,
turbo=FALSE, parallel.options=NULL, tol=1e-07, LAPACK=FALSE,
max.regs=NULL, print.searchinfo=TRUE, alarm=FALSE)
Arguments
- y
a numeric vector (with no missing values, i.e. no non-numeric 'holes')
- x
a matrix, or a list of matrices
- untransformed.residuals
NULL (default) or, when ols is used with method=6 in user.estimator, a numeric vector containing the untransformed residuals
- blocks
NULL (default) or a list of lists with vectors of integers that indicate how blocks should be put together. If NULL, then the block composition is undertaken automatically by an internal algorithm that depends on no.of.blocks, max.block.size and ratio.threshold
- no.of.blocks
NULL (default) or integer. If NULL, then the number of blocks is determined automatically by an internal algorithm
- max.block.size
integer that controls the size of blocks
- ratio.threshold
numeric between 0 and 1 that controls the minimum ratio of variables in each block to total observations
- gets.of.union
logical. If TRUE (default), then GETS modelling is undertaken of the union of retained variables. Otherwise it is not
- force.invertibility
logical. If TRUE, then the x-matrix is ensured to have full row-rank before it is passed on to getsFun
- user.estimator
list, see getsFun for the details
- t.pval
numeric value between 0 and 1. The significance level used for the two-sided coefficient significance t-tests
- wald.pval
numeric value between 0 and 1. The significance level used for the Parsimonious Encompassing Tests (PETs)
- do.pet
logical. If TRUE, then a Parsimonious Encompassing Test (PET) against the GUM is undertaken at each variable removal for the joint significance of all the deleted regressors along the current GETS path. If FALSE, then a PET is not undertaken at each removal
- ar.LjungB
a two-element vector, or NULL. In the former case, the first element contains the AR-order, the second element the significance level. If NULL, then a test for autocorrelation in the residuals is not conducted
- arch.LjungB
a two-element vector, or NULL. In the former case, the first element contains the ARCH-order, the second element the significance level. If NULL, then a test for ARCH in the residuals is not conducted
- normality.JarqueB
NULL or a numeric value between 0 and 1. In the latter case, a test for non-normality in the residuals is conducted using a significance level equal to normality.JarqueB. If NULL, then no test for non-normality is conducted
- user.diagnostics
NULL (default) or a list with two entries, name and pval. See getsFun for the details
- gof.function
list. The first item should be named name and contain the name (a character) of the Goodness-of-Fit (GOF) function used. Additional items in the list gof.function are passed on as arguments to the GOF-function. See getsFun for the details
- gof.method
character. Determines whether the best Goodness-of-Fit is a minimum (default) or maximum
- keep
NULL (default), vector of integers or a list of vectors of integers. In the latter case, the number of vectors should be equal to the number of matrices in x
- include.gum
logical. If TRUE, then the GUM (i.e. the starting model) is included among the terminal models
- include.1cut
logical. If TRUE, then the 1-cut model is added to the list of terminal models
- include.empty
logical. If TRUE, then the empty model is added to the list of terminal models
- max.paths
NULL (default) or integer greater than 0. If NULL, then there is no limit to the number of paths. If integer (e.g. 1), then this integer constitutes the maximum number of paths searched (e.g. a single path)
- turbo
logical. If TRUE, then (parts of) paths are not searched twice (or more) unnecessarily in each GETS modelling. Setting turbo to TRUE entails a small additional computational cost, which may be outweighed substantially if estimation is slow, or if the number of variables to delete in each path is large
- parallel.options
NULL or integer that indicates the number of cores/threads to use for parallel computing (implemented with makeCluster and parLapply)
- tol
numeric value, the tolerance for detecting linear dependencies in the columns of the variance-covariance matrix when computing the Wald-statistic used in the Parsimonious Encompassing Tests (PETs), see the qr.solve function
- LAPACK
currently not used
- max.regs
integer. The maximum number of regressions along a deletion path. Do not alter unless you know what you are doing!
- print.searchinfo
logical. If TRUE (default), then a message is printed whenever simplification along a new path is started
- alarm
logical. If TRUE, then a sound or beep is emitted (in order to alert the user) when the model selection ends
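As an illustration of the blocks argument, the following sketch supplies the block composition by hand instead of leaving it to the internal algorithm. The data are simulated, the object name my.blocks is hypothetical, and the layout (one inner list per matrix in x, each vector holding the column indices of one block) is an interpretation of the argument description above that should be checked against the installed package version.

```r
## Sketch only: simulated data, user-specified blocks
set.seed(123)  # reproducible draws
y <- rnorm(20)
x <- matrix(rnorm(length(y) * 40), length(y), 40)

## one inner list per matrix in 'x' (assumed layout); each integer
## vector lists the columns that make up one block
my.blocks <- list(list(1:10, 11:20, 21:30, 31:40))

## the call is only attempted if the 'gets' package is installed
if (requireNamespace("gets", quietly = TRUE)) {
  library(gets)
  res <- blocksFun(y, x, blocks = my.blocks, print.searchinfo = FALSE)
  str(res$blocks)  # blocks as used in the search
}
```

Blocks of 10 regressors are used here so that each block leaves degrees of freedom with 20 observations.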
Details
blocksFun undertakes block-based GETS modelling by a repeated but structured call to getsFun. For the details of how to user-specify an estimator via user.estimator, diagnostics via user.diagnostics and a goodness-of-fit function via gof.function, see the documentation of getsFun under "Details".
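A minimal sketch of a user-specified estimator may help here. The function name lmFun is hypothetical, and the entries it returns (n, k, df, coefficients, vcov, logl) are an assumption about what getsFun and the default goodness-of-fit function expect; the authoritative list is in the getsFun documentation. The function is defined at top level on the assumption that it is looked up by name.

```r
## Hypothetical estimator built on lm(); the returned entries are an
## assumption to be verified against the getsFun help page
lmFun <- function(y, x, ...) {
  result <- list()
  result$n <- length(y)
  result$k <- if (is.null(x)) 0 else NCOL(x)
  result$df <- result$n - result$k
  if (result$k > 0) {
    fit <- lm(y ~ x - 1)  # no intercept: x is used as given
    result$coefficients <- coef(fit)
    result$vcov <- vcov(fit)
    result$logl <- as.numeric(logLik(fit))
  } else {
    ## empty model: Gaussian log-likelihood of y
    result$logl <- sum(dnorm(y, sd = sqrt(var(y)), log = TRUE))
  }
  result
}

## simulated data and a stand-alone check of the estimator
set.seed(123)
y <- rnorm(20)
x <- matrix(rnorm(length(y) * 40), length(y), 40)
fit <- lmFun(y, x[, 1:5])
names(fit)

## pass the estimator by name (only if 'gets' is installed)
if (requireNamespace("gets", quietly = TRUE)) {
  library(gets)
  res <- blocksFun(y, x, user.estimator = list(name = "lmFun"),
    print.searchinfo = FALSE)
}
```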
The algorithm of blocksFun is similar to that of isat, but more flexible. The main use of blocksFun is the creation of user-specified methods that employ block-based GETS modelling, e.g. indicator saturation techniques.
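To give a feel for how regressors may be divided into blocks, the base R sketch below partitions column indices into consecutive blocks. The helper sketch_blocks is hypothetical and is not the package's internal algorithm; it treats both max.block.size and ratio.threshold times the number of observations as upper bounds on the block size, which is an assumption. For 20 observations and 40 regressors it happens to reproduce the three blocks printed in the first example under "Examples".

```r
## Hypothetical helper, not part of 'gets': split n.vars column
## indices into consecutive blocks
sketch_blocks <- function(n.obs, n.vars, max.block.size = 30,
                          ratio.threshold = 0.8) {
  ## largest admissible block under both caps (assumed rule)
  max.size <- min(n.obs * ratio.threshold, max.block.size)
  ## at least two blocks, and enough that none exceeds max.size
  no.of.blocks <- ceiling(max(2, n.vars / max.size))
  block.size <- ceiling(n.vars / no.of.blocks)
  idx <- seq_len(n.vars)
  unname(split(idx, ceiling(idx / block.size)))
}

blocks <- sketch_blocks(n.obs = 20, n.vars = 40)
lengths(blocks)  # 14 14 12
```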
Value
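A list with the results of the block-based GETS modelling. As the printed output under "Examples" shows, the list includes the entries call, time.started, time.finished, y, x, blocks, keep (when keep is specified) and specific.spec.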
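The returned list can be stored and inspected entry by entry rather than printed in full. The sketch below uses simulated data and only the entry names that appear in the printed output under "Examples"; which regressors are retained depends on the random draws, so no particular result is implied.

```r
## Sketch only: store the result and look at selected entries
set.seed(123)
y <- rnorm(20)
x <- matrix(rnorm(length(y) * 40), length(y), 40)

## the call is only attempted if the 'gets' package is installed
if (requireNamespace("gets", quietly = TRUE)) {
  library(gets)
  res <- blocksFun(y, x, print.searchinfo = FALSE)
  print(names(res))          # entries of the returned list
  str(res$blocks)            # column indices of each block
  print(res$specific.spec)   # retained regressors, if any
}
```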
References
F. Pretis, J. Reade and G. Sucarrat (2018): 'Automated General-to-Specific (GETS) Regression Modeling and Indicator Saturation for Outliers and Structural Breaks'. Journal of Statistical Software 86, Number 3, pp. 1-44
G. Sucarrat (2020): 'User-Specified General-to-Specific and Indicator Saturation Methods'. The R Journal 12 issue 2, pp. 388-401, https://journal.r-project.org/archive/2021/RJ-2021-024/
See also
getsFun, ols, diagnostics, infocrit and isat
Examples
## more variables than observations:
y <- rnorm(20)
x <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, x)
#>
#> x block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> x block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> x block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> $call
#> blocksFun(y, x)
#>
#> $time.started
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $time.finished
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$x
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#>
#> $blocks
#> $blocks$x
#> $blocks$x[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$x[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$x[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $specific.spec
#> $specific.spec$x
#> integer(0)
#>
#>
## 'x' as list of matrices:
z <- matrix(rnorm(length(y)*40), length(y), 40)
blocksFun(y, list(x,z))
#>
#> X1 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X1 block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X1 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> X2 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 2 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> $call
#> blocksFun(y, list(x, z))
#>
#> $time.started
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $time.finished
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$X1
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#> $x$X2
#> [1] "X2.xreg1" "X2.xreg2" "X2.xreg3" "X2.xreg4" "X2.xreg5" "X2.xreg6"
#> [7] "X2.xreg7" "X2.xreg8" "X2.xreg9" "X2.xreg10" "X2.xreg11" "X2.xreg12"
#> [13] "X2.xreg13" "X2.xreg14" "X2.xreg15" "X2.xreg16" "X2.xreg17" "X2.xreg18"
#> [19] "X2.xreg19" "X2.xreg20" "X2.xreg21" "X2.xreg22" "X2.xreg23" "X2.xreg24"
#> [25] "X2.xreg25" "X2.xreg26" "X2.xreg27" "X2.xreg28" "X2.xreg29" "X2.xreg30"
#> [31] "X2.xreg31" "X2.xreg32" "X2.xreg33" "X2.xreg34" "X2.xreg35" "X2.xreg36"
#> [37] "X2.xreg37" "X2.xreg38" "X2.xreg39" "X2.xreg40"
#>
#>
#> $blocks
#> $blocks$X1
#> $blocks$X1[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X1[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X1[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#> $blocks$X2
#> $blocks$X2[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X2[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X2[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $specific.spec
#> $specific.spec$X1
#> integer(0)
#>
#> $specific.spec$X2
#> integer(0)
#>
#>
## ensure regressor no. 3 in matrix no. 2 is not removed:
blocksFun(y, list(x,z), keep=list(integer(0), 3))
#>
#> X1 block 1 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X1 block 2 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X1 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> X2 block 1 of 3:
#> 13 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#>
#> X2 block 2 of 3:
#> 14 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#> 13
#> 14
#>
#> X2 block 3 of 3:
#> 12 path(s) to search
#> Searching:
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
#> 11
#> 12
#>
#> GETS of union of retained X2 variables...
#>
#> $call
#> blocksFun(y, list(x, z), keep = list(integer(0), 3))
#>
#> $time.started
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $time.finished
#> [1] "Mon Apr 7 09:08:15 2025"
#>
#> $y
#> [1] 0.42646422 -0.29507148 0.89512566 0.87813349 0.82158108 0.68864025
#> [7] 0.55391765 -0.06191171 -0.30596266 -0.38047100 -0.69470698 -0.20791728
#> [13] -1.26539635 2.16895597 1.20796200 -1.12310858 -0.40288484 -0.46665535
#> [19] 0.77996512 -0.08336907
#>
#> $x
#> $x$X1
#> [1] "X1.xreg1" "X1.xreg2" "X1.xreg3" "X1.xreg4" "X1.xreg5" "X1.xreg6"
#> [7] "X1.xreg7" "X1.xreg8" "X1.xreg9" "X1.xreg10" "X1.xreg11" "X1.xreg12"
#> [13] "X1.xreg13" "X1.xreg14" "X1.xreg15" "X1.xreg16" "X1.xreg17" "X1.xreg18"
#> [19] "X1.xreg19" "X1.xreg20" "X1.xreg21" "X1.xreg22" "X1.xreg23" "X1.xreg24"
#> [25] "X1.xreg25" "X1.xreg26" "X1.xreg27" "X1.xreg28" "X1.xreg29" "X1.xreg30"
#> [31] "X1.xreg31" "X1.xreg32" "X1.xreg33" "X1.xreg34" "X1.xreg35" "X1.xreg36"
#> [37] "X1.xreg37" "X1.xreg38" "X1.xreg39" "X1.xreg40"
#>
#> $x$X2
#> [1] "X2.xreg1" "X2.xreg2" "X2.xreg3" "X2.xreg4" "X2.xreg5" "X2.xreg6"
#> [7] "X2.xreg7" "X2.xreg8" "X2.xreg9" "X2.xreg10" "X2.xreg11" "X2.xreg12"
#> [13] "X2.xreg13" "X2.xreg14" "X2.xreg15" "X2.xreg16" "X2.xreg17" "X2.xreg18"
#> [19] "X2.xreg19" "X2.xreg20" "X2.xreg21" "X2.xreg22" "X2.xreg23" "X2.xreg24"
#> [25] "X2.xreg25" "X2.xreg26" "X2.xreg27" "X2.xreg28" "X2.xreg29" "X2.xreg30"
#> [31] "X2.xreg31" "X2.xreg32" "X2.xreg33" "X2.xreg34" "X2.xreg35" "X2.xreg36"
#> [37] "X2.xreg37" "X2.xreg38" "X2.xreg39" "X2.xreg40"
#>
#>
#> $blocks
#> $blocks$X1
#> $blocks$X1[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X1[[2]]
#> [1] 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X1[[3]]
#> [1] 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#> $blocks$X2
#> $blocks$X2[[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14
#>
#> $blocks$X2[[2]]
#> [1] 3 15 16 17 18 19 20 21 22 23 24 25 26 27 28
#>
#> $blocks$X2[[3]]
#> [1] 3 29 30 31 32 33 34 35 36 37 38 39 40
#>
#>
#>
#> $keep
#> $keep$X1
#> integer(0)
#>
#> $keep$X2
#> X2.xreg3
#> 3
#>
#>
#> $specific.spec
#> $specific.spec$X1
#> integer(0)
#>
#> $specific.spec$X2
#> [1] "X2.xreg3"
#>
#>