Model specification

Before estimating a model, you must load the data and specify the role of each variable in the model. This package defines Microdata for that purpose.

Microdata(
        DF::DataFrame,
        model::Dict{Symbol, String};
        hints::Dict{Symbol, TermOrTerms},
        subset::AbstractVector{Bool},
        weights::AbstractWeights = UnitWeights(size(DF, 1)),
        corr::CorrStructure,
    )

Constructing a Microdata requires two arguments: a DataFrame, which contains the data, and a dictionary, which specifies the components of the model of interest. Use the macro @micromodel to construct this dictionary.

All regression models need a response, but other requirements vary by model, so check the documentation of the estimator you intend to use. For example, OLS asks for response and control. Define these sets using the syntax of Formula (see the tutorial for examples). Conventional sets include:

response: the response (a.k.a. outcome or dependent variable);
control: exogenous explanatory variables (note that intercepts must be included explicitly, with + 1);
offset: an exogenous variable whose coefficient is constrained to unity;
treatment: endogenous explanatory variables;
instrument: instrumental variables (i.e. excluded exogenous variables).
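
As a rough illustration of how these sets fit together, the sketch below builds two dictionaries by hand with the Dict{Symbol, String} type shown in the constructor signature. The variable names (y, d, x1, x2, z) are hypothetical, and it is an assumption that hand-built dictionaries are interchangeable with the ones @micromodel produces; the macro remains the documented way to build them.

```julia
# Hand-built model dictionaries with the Dict{Symbol, String} type from the
# constructor signature. All variable names are hypothetical placeholders.

# OLS-style specification: a response and controls, with an explicit intercept.
ols_model = Dict{Symbol, String}(
    :response => "y",
    :control  => "x1 + x2 + 1",
)

# IV-style specification: an endogenous treatment and an excluded instrument.
iv_model = Dict{Symbol, String}(
    :response   => "y",
    :treatment  => "d",
    :control    => "x1 + x2 + 1",
    :instrument => "z",
)

# With the package loaded, a dictionary like these would be the second
# positional argument (assumption: df is a DataFrame holding these columns):
#   D = Microdata(df, iv_model)
```

Note that the intercept appears only because "+ 1" is written out in the control set.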

As for the keywords:

hints: a dictionary from column labels to schemas or contrasts.
subset: a Boolean vector that determines the estimation sample. Set an entry to true if the corresponding row of DF should be included and false if it should be excluded. This keyword is useful when you compare subgroups whose observations may be correlated (e.g., they may belong to the same cluster): chow_test takes that correlation into account if each Microdata was constructed with subset.
weights: a weight vector. Except for frequency weights, the weight vector is normalized to sum to the number of observations in the sample.
corr: a correlation structure (a CorrStructure).
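
The sketch below shows one way to prepare values for subset and weights, using only DataFrames and StatsBase. The data and column names are hypothetical. The rescaling at the end reproduces by hand the normalization described above, for illustration only; it is not taken from the package's implementation.

```julia
using DataFrames, StatsBase

# Hypothetical data; column names are placeholders.
df = DataFrame(
    y     = [1.0, 2.0, 3.0, 4.0],
    group = ["a", "b", "a", "b"],
    w     = [2.0, 1.0, 1.0, 4.0],
)

# subset: one Bool per row of the DataFrame; true keeps the row in the sample.
in_a = df.group .== "a"

# weights: an AbstractWeights object, here sampling (probability) weights.
pw = pweights(df.w)

# Normalization by hand: rescale so the weights sum to the number of observations.
normalized = df.w .* (length(df.w) / sum(df.w))

# With the package loaded, these would be passed as keywords
# (assumption: M is a model dictionary built with @micromodel):
#   D = Microdata(df, M; subset = in_a, weights = pw)
```

Building one Microdata per subgroup from the same DataFrame, each with its own subset mask, is the setup that chow_test needs to account for correlation across subgroups.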