binspwc implements hypothesis testing procedures for pairwise group comparison of binscatter estimators, following the
results in Cattaneo, Crump, Farrell and Feng (2021a).
If the binning scheme is not set by the user, the companion function binsregselect is used to implement binscatter in a
data-driven way. Binned scatter plots based on different methods can be constructed using the companion functions
binsreg, binsqreg or binsglm.
Hypothesis testing for parametric functional forms of and shape restrictions on the regression function of interest can
be conducted via the companion function binstest.
binspwc(y, x, w = NULL, data = NULL, estmethod = "reg",
family = gaussian(), quantile = NULL, deriv = 0, at = NULL,
nolink = F, by = NULL, pwc = c(3, 3), testtype = "two-sided",
lp = Inf, bins = c(2, 2), bynbins = NULL, binspos = "qs",
binsmethod = "dpi", nbinsrot = NULL, samebinsby = FALSE,
randcut = NULL, nsims = 500, simsgrid = 20, simsseed = NULL,
vce = NULL, cluster = NULL, asyvar = F, dfcheck = c(20, 30),
masspoints = "on", weights = NULL, subset = NULL, numdist = NULL,
numclust = NULL, ...)

y 
outcome variable. A vector. 
x 
independent variable of interest. A vector. 
w 
control variables. A matrix, a vector or a 
data 
an optional data frame containing variables used in the model. 
estmethod 
estimation method. The default is "reg".
family 
a description of the error distribution and link function to be used in the generalized linear model when 
quantile 
the quantile to be estimated. A number strictly between 0 and 1. 
deriv 
derivative order of the regression function for estimation, testing and plotting.
The default is 0.
at 
value of 
nolink 
if true, the function within the inverse link function is reported instead of the conditional mean function for the outcome. 
by 
a vector containing the group indicator for subgroup analysis; both numeric and string variables
are supported. When 
pwc 
a vector. 
testtype 
type of pairwise comparison test. The default is a two-sided test.
lp 
an Lp metric used for (two-sided) parametric model specification testing and/or shape restriction testing. The default is Inf, which corresponds to the sup-norm.
bins 
A vector. Degree and smoothness for bin selection. The default is c(2, 2).
bynbins 
a vector of the number of bins for partitioning/binning of 
binspos 
position of binning knots. The default is "qs" (quantile-spaced binning).
binsmethod 
method for data-driven selection of the number of bins. The default is "dpi".
nbinsrot 
initial number of bins value used to construct the DPI number of bins selector. If not specified, the data-driven ROT selector is used instead. 
samebinsby 
if true, a common partitioning/binning structure across all subgroups specified by the option by is used. The default is FALSE.
randcut 
upper bound on a uniformly distributed variable used to draw a subsample for bins selection.
Observations for which 
nsims 
number of random draws for hypothesis testing. The default is 500.
simsgrid 
number of evaluation points of an evenly-spaced grid within each bin used for evaluation of
the supremum (infimum or Lp metric) operation needed to construct hypothesis testing
procedures. The default is 20.
simsseed 
seed for simulation. 
vce 
procedure to compute the variance-covariance matrix estimator. For least squares regression and generalized linear regression, the allowed options are the same as that for 
cluster 
cluster ID. Used to compute cluster-robust standard errors. 
asyvar 
If true, the standard error of the nonparametric component is computed and the uncertainty related to control
variables is omitted. Default is FALSE.
dfcheck 
adjustments for minimum effective sample size checks, which take into account number of unique
values of 
masspoints 
how mass points in

weights 
an optional vector of weights to be used in the fitting process. Should be 
subset 
optional rule specifying a subset of observations to be used. 
numdist 
Number of distinct values for selection. Used to speed up computation. 
numclust 
Number of clusters for selection. Used to speed up computation. 
... 
optional arguments to control bootstrapping if 
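
Two of the arguments above, binspos and simsgrid, describe how the support of x is partitioned and where the test statistic is evaluated. The following is a minimal base R sketch of those two ideas, not the package's internal code. The number of bins (nbins), the helper grid_in_bin, the choice to place grid points strictly inside each bin, and the evenly-spaced alternative shown for contrast are all assumptions made for illustration.

```r
# Illustration only: base R, not the binsreg internals.
set.seed(42)
x <- rexp(200)        # a skewed covariate
nbins <- 4            # hypothetical number of bins

# Quantile-spaced knots: each bin holds roughly the same number of observations.
knots_qs <- quantile(x, probs = seq(0, 1, length.out = nbins + 1), names = FALSE)

# Evenly-spaced knots (shown for contrast): each bin has the same width.
knots_es <- seq(min(x), max(x), length.out = nbins + 1)

# Observations per bin under each scheme.
counts_qs <- table(cut(x, knots_qs, include.lowest = TRUE))
counts_es <- table(cut(x, knots_es, include.lowest = TRUE))
print(counts_qs)
print(counts_es)

# One possible reading of simsgrid: an evenly-spaced set of interior
# evaluation points within each bin (hypothetical helper).
simsgrid <- 20
grid_in_bin <- function(lo, hi, n) seq(lo, hi, length.out = n + 2)[-c(1, n + 2)]
grid <- unlist(mapply(grid_in_bin,
                      knots_qs[-length(knots_qs)], knots_qs[-1],
                      MoreArgs = list(n = simsgrid), SIMPLIFY = FALSE))
length(grid)          # nbins * simsgrid evaluation points in total
```

With skewed data the quantile-spaced bins are narrow where observations are dense and wide in the tail, while the evenly-spaced bins leave few observations in the upper bins.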

A matrix. Each row corresponds to the comparison between two groups. The first column is the test statistic. The second and third columns give the corresponding group numbers.
The null hypothesis is 

A vector of p-values for all pairwise group comparisons. 
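
The layout of the returned matrix (one row per pair of groups, statistic first, then the two group numbers) can be mimicked in base R. The sketch below only illustrates that layout. The statistic it computes (toy_stat) is a made-up quantity, not the test statistic used by binspwc, and the simulated data and the choice of five bins are assumptions.

```r
# Illustration only: enumerates group pairs and fills a matrix with the
# same column layout as described above. The statistic is NOT the package's.
set.seed(1)
n <- 300
g <- sample(c("a", "b", "c"), n, replace = TRUE)   # group indicator (like `by`)
x <- runif(n)
y <- sin(2 * x) + 0.5 * (g == "c") + rnorm(n, sd = 0.3)

groups <- sort(unique(g))
pairs <- t(combn(seq_along(groups), 2))            # one row per pair of group numbers

# Toy statistic: largest absolute gap between the two groups' within-bin
# means of y, over a common set of 5 quantile-spaced bins.
bins <- cut(x, quantile(x, seq(0, 1, length.out = 6)), include.lowest = TRUE)
bin_means <- function(k) tapply(y[g == groups[k]], bins[g == groups[k]], mean)
toy_stat <- apply(pairs, 1, function(p) {
  max(abs(bin_means(p[1]) - bin_means(p[2])), na.rm = TRUE)
})

result <- cbind(stat = toy_stat, group1 = pairs[, 1], group2 = pairs[, 2])
print(result)
```

With three groups there are choose(3, 2) = 3 comparisons, so the matrix has three rows and a matching vector of p-values would have three entries.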

A list containing options passed to the function, as well as 
Matias D. Cattaneo, Princeton University, Princeton, NJ. cattaneo@princeton.edu.
Richard K. Crump, Federal Reserve Bank of New York, New York, NY. richard.crump@ny.frb.org.
Max H. Farrell, University of Chicago, Chicago, IL. max.farrell@chicagobooth.edu.
Yingjie Feng (maintainer), Tsinghua University, Beijing, China. fengyingjiepku@gmail.com.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2021a: On Binscatter. Working Paper.
Cattaneo, M. D., R. K. Crump, M. H. Farrell, and Y. Feng. 2021b: Binscatter Regressions. Working Paper.
binsreg, binsqreg, binsglm, binsregselect, binstest.
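
A minimal usage sketch on simulated data follows, assuming the binsreg package is installed. It passes only y, x and by, leaving every other argument at the defaults listed in the usage above. The variable names and the simulated design (two groups that share the same regression function) are assumptions, and the result is inspected with str() rather than by assuming component names.

```r
# Simulated data: two groups with the same regression function sin(x).
set.seed(123)
n <- 500
x <- runif(n)
y <- sin(x) + rnorm(n)
grp <- 1 * (runif(n) > 0.5)      # group indicator taking values 0 and 1

# Run the pairwise comparison only if the package is available.
if (requireNamespace("binsreg", quietly = TRUE)) {
  res <- binsreg::binspwc(y, x, by = grp)
  str(res, max.level = 1)        # inspect the returned components
}
```

Because both groups are drawn from the same model, a large p-value for the single pairwise comparison would be the expected outcome in most simulated samples.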
