Model specification
Before you estimate a model, you must load the data and specify the role of each variable in the model. This package defines Microdata for that purpose.
Microdata(
    DF::DataFrame,
    model::Dict{Symbol, String};
    hints::Dict{Symbol, TermOrTerms},
    subset::AbstractVector{Bool},
    weights::AbstractWeights = UnitWeights(size(DF, 1)),
    corr::CorrStructure,
)
To construct a Microdata, two arguments are compulsory: a DataFrame, which contains the data, and a dictionary, which specifies the components of the model of interest. To construct this dictionary, use the macro @micromodel.
All regression models need a response, but other requirements vary from model to model, so check the documentation of each estimator. For example, OLS asks for response and control. In defining these sets, follow the syntax of Formula. (See the tutorial for examples.) Conventional sets include:
response: the response (a.k.a. outcome or dependent variable);
control: exogenous explanatory variables (n.b.: you must explicitly include intercepts, with + 1);
offset: an exogenous variable whose coefficient is constrained to unity;
treatment: endogenous explanatory variables;
instrument: instrumental variables (i.e., excluded exogenous variables).
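As an illustration of the sets above, the following is a minimal sketch of how a model with response and control might be specified and passed to Microdata. The data and the column names y, x, and z are hypothetical, and the exact form of the @micromodel call (pairs of set names and formula-style expressions) and the fit(OLS, D) call are assumptions to verify against the tutorial.

```julia
using DataFrames, Microeconometrics

# Toy data; the column names y, x, and z are hypothetical.
S = DataFrame(
    y = [1.2, 2.3, 2.9, 4.1, 5.2, 5.8],
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    z = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
)

# Assumed macro syntax: each set is paired with a formula-style expression.
# The intercept is not implicit, hence the explicit `+ 1`.
M = @micromodel(response => y, control => x + z + 1)

# Only the data and the model dictionary are compulsory; keywords keep their defaults.
D = Microdata(S, M)

# OLS asks for response and control, both of which M provides.
E = fit(OLS, D)
```

Omitting the `+ 1` term would fit the model without an intercept, since intercepts are never added automatically.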
As for the keywords:
hints: a dictionary from column labels to schemas or contrasts.
subset: determines the estimation sample. Set an entry to true if the corresponding row of DF should be included and to false if it should be excluded. This keyword is useful if you are comparing subgroups and observations in different subgroups may correlate (e.g., they may belong to the same cluster). chow_test will take that correlation into account if the Microdata was constructed with subset.
weights: a weight vector. Except for frequency weights, the weight vector is normalized to sum up to the number of observations in the sample.
corr: a correlation structure.
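To make the subset and weights keywords concrete, here is a small sketch in plain Julia of a Boolean mask in the shape subset expects and of the normalization described for non-frequency weights. The grouping variable and the raw weights are hypothetical, and the rescaling is only an illustration of the stated rule (here applied over all rows), not the package's internal code.

```julia
# Hypothetical grouping variable, one entry per row of DF.
group = ["a", "a", "b", "b", "a", "b"]

# Boolean mask: true keeps the row in the estimation sample, false excludes it.
mask = group .== "a"

# Hypothetical raw sampling weights, one per row.
w = [2.0, 1.0, 4.0, 3.0, 3.0, 5.0]

# Rescale the weights so that they sum to the number of observations.
n = length(w)
w_normalized = w .* (n / sum(w))

println(count(mask))        # number of rows kept by the mask
println(sum(w_normalized))  # total weight after rescaling
```

A mask built this way can be passed as subset = mask, which keeps the excluded rows available for computing correlations across subgroups rather than dropping them from the DataFrame beforehand.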