Upgrade package versions used in the tafe conda environment to be
consistent with the packages included in the new Anaconda 2023.07
distribution, which includes Python 3.11.
Changes for Version 4.27.0 (2023-04-13)
Upgrade package versions used in the tafe conda environment to be
consistent with the packages included in the new Anaconda 2023.03-0
distribution, which includes Python 3.10.
Make a number of non-substantive code changes in order to suppress
new pylint 2.16 coding-style warnings.
Changes for Version 4.26.0 (2022-10-25)
Upgrade package versions used in the tafe conda environment to be
consistent with the packages included in the new Anaconda 2022.10
distribution.
Make a number of non-substantive code changes in order to suppress
new pylint 2.14 coding-style warnings.
Changes for Version 4.25.0 (2022-09-24)
Revise Gini coefficient label in the CLI .tab output to clarify that
the Gini coefficient is calculated using just tax returns (rather than
the whole population).
Changes for Version 4.24.0 (2022-09-19)
Enhance the tacopyfiles development tool to revise new text in the
model installation documentation.
Changes for Version 4.23.0 (2022-08-15)
Add pyyaml package to list of packages required for the tafe conda
environment.
Changes for Version 4.22.0 (2022-07-13)
Add GZIP compress utility function.
Reduce Analyzer.MIN_XS_BIN_OBS from 90 to 55 so that cross-sectional
percentile graphs are generated for samples as small as 5,500 tax
filing units.
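The link between the 55-observation minimum and the 5,500-unit sample size
can be checked with a little arithmetic. The sketch below assumes that the
cross-sectional graphs use 100 percentile bins and that each bin must hold
at least MIN_XS_BIN_OBS observations; the NUM_PERCENTILE_BINS name and the
min_sample_size helper are hypothetical and are not taken from the
Framework source code.

```python
# Assumption: 100 percentile bins, each needing at least MIN_XS_BIN_OBS
# observations. NUM_PERCENTILE_BINS is a hypothetical name.
NUM_PERCENTILE_BINS = 100


def min_sample_size(min_xs_bin_obs, num_bins=NUM_PERCENTILE_BINS):
    """Return the smallest sample that can fill every bin to the minimum."""
    return min_xs_bin_obs * num_bins


old_minimum = min_sample_size(90)  # minimum sample before this change
new_minimum = min_sample_size(55)  # minimum sample after this change
print(old_minimum, new_minimum)
```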
Changes for Version 4.21.0 (2022-05-12)
Implement the stratified_sample method in the SampleReWeighting class.
Add optional other-variable adjustment arguments to the MICE class
constructor and add associated logic; read the MICE constructor
documentation about the new ovar_idx and zero_ovar_below_abs
arguments for details on how this new adjustment works.
Add analyzer module logic that restricts table and graph output to
filers if a model has a nonfiler variable.
Revise setup.py and environment.yml files to require the newer
packages included in the just-released 2022.05 Anaconda distribution.
Add deprecation warning filters to the utils.py module to suppress
warnings generated by the new matplotlib 3.5 package.
Changes for Version 4.20.0 (2022-04-11)
Switch to using maximum likelihood estimation of alpha in the
IncomeUnderReporting class synthetic_adjustment_values_infer
method. This change improves the statistical precision of the
estimate and speeds execution, but produces essentially the
same results as does the Monte Carlo estimation technique used
previously.
Improve technique of ensuring alpha estimate is always positive in
the IncomeUnderReporting class synthetic_adjustment_values_mean method.
Expand the discussion of Problem 5 in the documentation of the data
preparation tools.
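As an illustration of the maximum likelihood approach mentioned for this
version, here is a minimal sketch of the textbook estimator for the shape
parameter of a Pareto distribution with a known threshold. It assumes that
alpha is a Pareto shape parameter (as the 4.17.0 notes suggest); the
function name and arguments are hypothetical, and the actual
synthetic_adjustment_values_infer logic may differ.

```python
import math
import random


def pareto_alpha_mle(incomes, threshold):
    """Textbook MLE of the Pareto shape parameter for incomes >= threshold.

    alpha_hat = n / sum(log(x_i / threshold)); this is a hypothetical
    helper, not the Framework's implementation.
    """
    tail = [x for x in incomes if x >= threshold]
    log_ratio_sum = sum(math.log(x / threshold) for x in tail)
    if log_ratio_sum <= 0.0:
        raise ValueError("need at least one income above the threshold")
    return len(tail) / log_ratio_sum


# Simulate a Pareto tail with a known alpha and recover it from the sample.
random.seed(12345)
true_alpha = 2.0
threshold = 100_000.0
sample = [threshold * random.paretovariate(true_alpha) for _ in range(20_000)]
alpha_hat = pareto_alpha_mle(sample, threshold)
print(round(alpha_hat, 2))
```

Because the estimator has a closed form, it needs a single pass over the
data, which is consistent with the speed gain over a Monte Carlo technique
described above.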
Changes for Version 4.19.0 (2022-04-03)
Add logic for the SampleReWeighting class uniform_weight_sample
method.
Add ability to avoid duplicate id values when the
IncomeUnderReporting class adjust method is adding extra records;
also add an optional verbose argument to the adjust method to get
details on the sample adjustment process.
Rename the IncomeUnderReporting class synthetic_adjustment_values
method as synthetic_adjustment_values_mean in order to clarify its
difference from the existing synthetic_adjustment_values_infer
method.
Changes for Version 4.18.0 (2022-03-16)
Enhance the IncomeUnderReporting class to have the ability to adjust
survey data in which some high incomes are completely missing
(because of non-response by the rich) and other high incomes are
under-reported. This adjustment capability uses an approach that is
similar, in part, to the approach advocated by Blanchet, Flores, and
Morgan (BFM) in their 2018 working paper.
Add to the documentation of the data-preparation tools an extensive
example that explains how to use the
enhanced IncomeUnderReporting tool to adjust survey data so it can
be used effectively as input to a tax microsimulation model.
Make several minor enhancements to the log_qq_plot.py utility that
compares two distributions by showing tabular results and plotting a
quantile-quantile graph of the two distributions of log values.
Changes for Version 4.17.0 (2022-02-27)
Add to the IncomeUnderReporting class the ability to infer from
survey data a Pareto distribution parameter value that can be used
to adjust high under-reported incomes in the same survey data. This
method is essentially the same as used in a recent OECD working
paper.
Add documentation on the new parameter inference capability in the
IncomeUnderReporting class.
Changes for Version 4.16.0 (2022-02-16)
Add an IncomeUnderReporting class that can adjust under-reported
high incomes (typically in survey data) using either actual micro
data (typically from tax-return data) or synthetic micro data
generated using a few macro assumptions about income under
reporting.
Add documentation on the data-preparation tools
recently added to the Framework. The documentation includes tool
descriptions as well as several examples showing how to use the tools
to solve different kinds of data-preparation problems.
Changes for Version 4.15.0 (2022-02-06)
Add to the Multivariate Imputation by Chained Equations (MICE) class
the ability to handle monotone missing-data patterns efficiently.
Add to the Multivariate Imputation by Chained Equations (MICE) class
the ability to adjust imputed values in several ways so that missing
not at random (MNAR) situations can be handled when the missing data
pattern is monotone.
Upgrade the run specification language parser to one that is built
using the newest 1.1.0 version of Lark.
Changes for Version 4.14.0 (2022-01-22)
Improve the Multivariate Imputation by Chained Equations (MICE)
class by streamlining its logic, adding a constructor argument,
renaming a constructor argument, and improving its documentation.
The basic logic is unchanged, but the logic streamlining causes
imputed results to be different from the results produced by the
prior 4.13 version even when using the same MICE random number seed.
Changes for Version 4.13.0 (2022-01-17)
Make the Python packages in the tafe environment be consistent
with the packages contained in the latest (2021.11) Anaconda Python
distribution, which will require removing the old
environment if you already have the
Tax-Analyzer-Framework package installed on your computer.
Add minor improvements to the gini utility function.
Add a Multivariate Imputation by Chained Equations (MICE) class that
imputes values that are missing at random (MAR) or missing
completely at random (MCAR). Read the docstring in the new
taf/mice.py module for details on how the MICE algorithm has been
implemented. Read the taf/dptests/test1.py module for a simple
example of using the MICE class in a Python script. For more
information on imputing missing data values, consult the van Buuren
book.
Changes for Version 4.12.0 (2021-11-08)
Add a gini utility function that computes a Gini coefficient for
weighted data and use that function to report in the CLI/GUI table
output Gini coefficients for before-tax and after-tax income for
both current law and the reform.
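One common way to compute a Gini coefficient for weighted data is to
integrate the weighted Lorenz curve with the trapezoid rule. The sketch
below shows that technique; the weighted_gini name and signature are
assumptions made for illustration and may not match the Framework's gini
utility function.

```python
def weighted_gini(values, weights):
    """Gini coefficient from the weighted Lorenz curve (trapezoid rule).

    Hypothetical helper: assumes non-negative values and positive weights.
    """
    pairs = sorted(zip(values, weights))
    total_weight = sum(w for _, w in pairs)
    total_value = sum(v * w for v, w in pairs)
    if total_weight <= 0.0 or total_value <= 0.0:
        raise ValueError("weights and weighted values must sum to > 0")
    area = 0.0        # twice the area under the Lorenz curve
    cum_value = 0.0   # cumulative share of weighted value so far
    for value, weight in pairs:
        prev_value = cum_value
        cum_value += value * weight / total_value
        area += (weight / total_weight) * (cum_value + prev_value)
    return 1.0 - area


# A weight of 2 on a record gives the same result as listing it twice.
gini_weighted = weighted_gini([1.0, 3.0], [2.0, 1.0])
gini_repeated = weighted_gini([1.0, 1.0, 3.0], [1.0, 1.0, 1.0])
print(gini_weighted, gini_repeated)
```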
Changes for Version 4.11.0 (2021-10-21)
Add environment.yml file so that the conda tool can be used to
automate the process of installing Anaconda packages required by the
Framework and models built with it.
Revise installation documentation for the Framework and for each
Framework-derived model to explain how to create and activate the
Tax-Analyzer-Framework environment, tafe.
Revise documentation for each model built with the Framework to
explain how to activate the tafe environment when using or
developing the model.
Changes for Version 4.10.0 (2021-10-15)
Revise installation documentation to indicate that a 64-bit Anaconda
Python distribution must be installed in order to use
Framework-derived tax microsimulation models.
Clarify installation documentation given changes in versions 4.9.3
and 4.9.4 that align the minimum required package versions with
those included in the current (2021.05) Anaconda Python
distribution.
Improve GUI help on custom tabulation of database dump output by
adding instructions on how to start SQLite shell program to execute
SQL commands interactively.
Add two tests of the Framework's policy reform capabilities.
Changes for Version 4.9.4 (2021-09-29)
Revise minimum required matplotlib version from 3.4.0 (which was
released some time ago on March 26, 2021) to 3.3.4 (which is, for
some reason, the version still included in the Anaconda distribution
as of September 29, 2021).
Changes for Version 4.9.3 (2021-09-29)
Revise minimum required pandas version from 1.3.0 (which was
released some time ago on July 2, 2021) to 1.2.4 (which is, for some
reason, the version still included in the Anaconda distribution as
of late September, 2021).
Changes for Version 4.9.2 (2021-09-25)
Fix GUI class so that the list of CSV-formatted input data files,
which appears on the Run page, is updated each time the user changes
the Project directory.
Changes for Version 4.9.1 (2021-09-20)
Fix GUI class constructor so that it removes all files ending with
.txt or .html from the top-level project directory (so that the
newest versions of such help documentation always replace older
versions).
Fix the Growth class constructor so that any CSV-formatted file
containing growth factors can be used by the class.
Fix logic of GUI _parameter_value_is_valid function so that value
expressions containing -. are invalid (because they are illegal
JSON).
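The JSON grammar requires at least one digit before a decimal point, which
is why an expression such as -.5 is illegal while -0.5 is fine. The sketch
below demonstrates this with Python's standard json module; the
is_legal_json_number helper is hypothetical and is not the GUI's
_parameter_value_is_valid function.

```python
import json


def is_legal_json_number(text):
    """Return True if text parses as a single JSON number (hypothetical)."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return False
    # bool is a subclass of int in Python, so exclude true/false explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


for candidate in ["-0.5", "-.5", ".5", "0.5"]:
    print(candidate, is_legal_json_number(candidate))
```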
Changes for Version 4.9.0 (2021-09-05)
Update required Anaconda package versions in setup.py and in
www/anaconda.md documentation.
Call pandas DataFrame sum method with numeric_only=True argument in
order to avoid deprecation warning introduced when upgrading from
pandas 1.2 to 1.3.
Make non-substantive changes in code and add pylint-disable comments
in order to suppress pylint warnings that appear when upgrading from
pylint 2.7 to 2.9.
Changes for Version 4.8.0 (2021-08-30)
Add to plot utility the ability to generate a cross-section
after-tax income graph that displays data for a second alternative
reform.
Add --extract option to the csvshow tool.
Revise the tamakefast utility so that it uses numba.njit
parallel=True and prange in calcfuncsfast.py functions that contain
a loop through Records variables, a change that speeds up tax
calculations by using a thread for each CPU core.
Changes for Version 4.7.0 (2021-06-14)
Add aggregate tax change amount at the top of each cross-section
distributional graph.
Add to GUI Output page the ability to conduct custom tabulations,
using structured query language (SQL), of dump output written to an
SQLite3 database file. Each model comes with at least one such
custom SQL script, which can serve as a template for user-specified
SQL scripts. Both model-supplied and user-written SQL scripts are
available for tabulating dump output from the GUI Output page.
Add a new GUI Calcopts page that provides the ability to specify
options that control the model's mode of calculation.
Changes for Version 4.6.0 (2021-05-22)
Add to each cross-sectional graph output file a new column that
includes the value of income at the bottom of each percentile.
Revise the plot tool so that it can handle the new format of the
cross-sectional graph output files.
Changes for Version 4.5.1 (2021-05-21)
Fix incorrect revision in Framework's cross-sectional graphing logic
introduced in version 4.5.0 to enable cross-sectional graphing in
the presence of a bunched income distribution.
Changes for Version 4.5.0 (2021-05-20)
Add to graphical user interface (GUI) Output page the ability to
specify a non-zero minimum percentile for the cross-sectional
graphs, which provides a kind of graph zoom capability.
Revise Framework's cross-sectional graphing logic so that bunched
income distributions can be graphed without error.
Changes for Version 4.4.0 (2021-05-16)
Add to graphical user interface (GUI) the ability to specify a run
that generates dump output as either a CSV-formatted file or an
SQLite database file.
Changes for Version 4.3.0 (2021-05-10)
Improve graphical user interface (GUI) in numerous ways.
Test GUI on Windows.
Changes for Version 4.2.0 (2021-05-02)
Improve graphical user interface (GUI) in several ways, including
allowing GUI quit from any notebook page.
Changes for Version 4.1.0 (2021-04-25)
Add option to have only start-year weights in the weights file and
to use a WEIGHTS growth factor to extrapolate the weights to
subsequent years.
Improve graphical user interface (GUI) in several ways, including
providing more descriptive names for files available on the Output
page.
Changes for Version 4.0.0 (2021-04-14)
Add graphical user interface (GUI) for all Framework-derived models.
Simplify specification of behavior response bin edges.
Changes for Version 3.2.0 (2021-03-17)
Require numba version 0.53.0 or higher in order to allow models
built with the Framework to run under Python 3.9.x versions (as well
as under Python 3.7.x and Python 3.8.x versions).
Changes for Version 3.1.0 (2021-03-08)
Make new model development quicker and easier by consolidating
definitions of model parameters and variables in a new
package.json file, reducing the number of JSON files from sixteen
to seven.
Changes for Version 3.0.2 (2021-02-28)
Fix Analyzer class so that preexisting output files for all years
are deleted at the beginning of a model run.
Fix Analyzer class so that language-words file includes list of
variables to which each growth factor is applied.
Enhance documentation of run specification language and RUNSPEC CLI.
Changes for Version 3.0.1 (2021-02-27)
Fix run specification language parser and interpreter to handle the
"USE DUMPVARS {};" instruction correctly.
Fix Behavior and Analyzer class logic so that marginal tax rates
computed in the behavior response logic are added to dump output
even if using minimal dump variables.
Enhance documentation of run specification language and RUNSPEC CLI.
Changes for Version 3.0.0 (2021-02-22)
Add a RUNSPEC CLI tool that accepts a single file describing all the
details of a model run using a simple run specification language,
which is described in the updated telling-the-model-what-to-do
documentation for the USW-Tax-Analyzer model, one of
the models built using the Framework.
Remove the JSON CLI --outdir option, but otherwise the JSON CLI
works as before.
Add a second parameter output file that contains just the parameters
whose value changes between current-law and the reform policy; this new
output file has a .pch extension.
Changes for Version 2.9.0 (2021-01-27)
Add the taunzip tool that unzips a zip file without decompressing
gz-compressed files that are inside the zip file. The taunzip
tool has a --password option that allows it to unzip encrypted zip
files that are password protected. This tool is helpful to those
working on Windows, which does not have this capability.
Add the jsoncheck tool that checks JSON syntax in a specified
file.
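The behavior described for taunzip (extracting zip members while leaving
gz-compressed members compressed) can be sketched with Python's standard
zipfile module, which copies each member out as stored bytes and does not
look inside a .gz member. The unzip_keep_gz helper below is hypothetical
and is not the taunzip source code; note also that zipfile can decrypt only
the legacy zip password scheme.

```python
import gzip
import io
import pathlib
import tempfile
import zipfile


def unzip_keep_gz(zip_source, dest_dir, password=None):
    """Extract all members of a zip file; .gz members stay gz-compressed.

    Hypothetical helper, not the taunzip implementation.
    """
    pwd = password.encode() if password else None
    with zipfile.ZipFile(zip_source) as archive:
        archive.extractall(path=dest_dir, pwd=pwd)
        return archive.namelist()


# Build a small in-memory zip file holding one gz-compressed CSV file.
payload = gzip.compress(b"id,income\n1,50000\n")
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("data.csv.gz", payload)
buffer.seek(0)

with tempfile.TemporaryDirectory() as tmp:
    names = unzip_keep_gz(buffer, tmp)
    extracted = (pathlib.Path(tmp) / "data.csv.gz").read_bytes()

print(names, extracted[:2] == b"\x1f\x8b")  # gzip files start with 1f 8b
```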
Changes for Version 2.8.0 (2021-01-24)
On Windows add to the tahead, tatail, and tagrep tools logic
that expands wildcard characters (* and ?) into a list of file names
(so that these tools have on Windows the same capabilities as they
do on Linux or on a Mac).
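On Linux and macOS the shell expands wildcards before a tool sees its
arguments, whereas the Windows command prompt passes them through
unchanged, so a tool has to expand them itself. A minimal sketch of such
expansion using the standard glob module follows; the expand_wildcards
helper is hypothetical and is not the logic used in tahead, tatail, or
tagrep.

```python
import glob
import os
import pathlib
import tempfile


def expand_wildcards(arguments):
    """Expand * and ? in each argument into a sorted list of file names.

    Hypothetical helper; an argument with no match is kept as given.
    """
    expanded = []
    for arg in arguments:
        if "*" in arg or "?" in arg:
            matches = sorted(glob.glob(arg))
            expanded.extend(matches if matches else [arg])
        else:
            expanded.append(arg)
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv", "c.txt"):
        (pathlib.Path(tmp) / name).write_text("x\n")
    # escape the directory part so only the file pattern is a wildcard
    pattern = os.path.join(glob.escape(tmp), "*.csv")
    found = [os.path.basename(p) for p in expand_wildcards([pattern])]

print(found)
```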
Changes for Version 2.7.0 (2021-01-20)
Enhance JSON file reading capability so that duplicate keys are
detected and generate an error message.
Minor documentation improvements.
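Python's json module silently keeps the last value when a JSON object
repeats a key, so detecting duplicates takes extra logic. One standard way
to do it is the object_pairs_hook argument of json.loads, sketched below;
the function names are hypothetical and the Framework's own JSON reading
code may work differently.

```python
import json


def reject_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs, raising on a repeated key."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError(f"duplicate JSON key: {key!r}")
        result[key] = value
    return result


def loads_strict(text):
    """Parse JSON text, treating duplicate object keys as an error."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


print(loads_strict('{"rate": 0.1, "exemption": 4000}'))
try:
    loads_strict('{"rate": 0.1, "rate": 0.2}')
except ValueError as err:
    print("error:", err)
```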
Changes for Version 2.6.0 (2021-01-14)
Require only pandas version 1.1.3 or higher (because 1.2.0 is not
yet available in all Anaconda distributions).
Changes for Version 2.5.0 (2021-01-10)
Fix typo in screen output generated by the tatools utility.
Add documentation about SQLite3 as data analysis software.
Require pandas version 1.2.0 or higher, in order to avoid errors in
recent 1.1.? versions.
Changes for Version 2.4.0 (2021-01-07)
Add CLI --dbdump option that causes the dump output file to be
written as an SQLite3 database file (ending in .db instead of .csv)
that includes two tables of dumpvars, one table for the reform
(named ref) and one table for current-law policy (named clp). The
sqlite3 command-line shell, which is included in the Anaconda
Python distribution, can be used to do custom tabulations of the
dump output database using the SQL declarative table language.
Add --xmin option to the plot tool that generates standard graphs.
Drop support for Python 3.6 given that pandas will be doing that in
the near future.
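A custom tabulation of a --dbdump database might look like the sketch
below, which uses Python's standard sqlite3 module rather than the sqlite3
command-line shell. The table names ref and clp come from the notes above,
but the id, weight, and tax columns and all of the data values are
hypothetical stand-ins for whatever dumpvars a model actually writes.

```python
import sqlite3

# Build a tiny in-memory stand-in for a dump database (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clp (id INTEGER PRIMARY KEY, weight REAL, tax REAL);
    CREATE TABLE ref (id INTEGER PRIMARY KEY, weight REAL, tax REAL);
    INSERT INTO clp VALUES (1, 100.0, 500.0), (2, 200.0, 1000.0);
    INSERT INTO ref VALUES (1, 100.0, 450.0), (2, 200.0, 1100.0);
    """
)

# Weighted aggregate change in tax liability caused by the reform.
query = """
    SELECT SUM(ref.weight * (ref.tax - clp.tax))
    FROM ref JOIN clp ON ref.id = clp.id;
"""
(aggregate_change,) = conn.execute(query).fetchone()
conn.close()
print(aggregate_change)
```

The same SELECT statement could be typed into the sqlite3 shell after
opening the .db file written by a model run.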
Changes for Version 2.3.0 (2020-12-28)
Add tals CLI tool that displays information on all files and
subdirectories in the current working directory.
Changes for Version 2.2.0 (2020-12-27)
Revise Framework API so that each Framework-derived
model has a CLI --xstax output option, where the --xstax
output is in a tab-delimited text file containing for each income
percentile the reform-induced percentage change in tax liability.
Revise Framework API so that each Framework-derived
model's plot tool can graph the new --xstax output.
Changes for Version 2.1.0 (2020-12-22)
Revise Framework API to allow, but not require, a Framework-derived
model to have a CLI --xsetr output option, where the --xsetr
output is in a tab-delimited text file containing for each income
percentile the effective tax rate.
Revise Framework API to include a Plot class, and associated
command-line plot tool, that generates graphs (in svg, pdf, or eps
format) for output files generated by the CLI --tstax, --xsati,
and --xsetr options.
Changes for Version 2.0.0 (2020-12-17)
Revise Framework API to allow, but not require, a Framework-derived
model to have a CLI --tstax output option and/or a CLI --xsati
output option, where the --tstax output is in a tab-delimited text
file containing time-series aggregate taxes under current-law and
the reform for each simulated year and where the --xsati output is
in a tab-delimited text file containing for each income percentile
the reform-induced percentage change in after-tax income.
Changes for Version 1.15.1 (2020-12-13)
Fix Analyzer class logic so that matched data are correctly
extrapolated to the first tax year.
Revise CLI --help to clarify that matched data/weights files are
assumed to have the same year range as the data/weights files built
into the model package.
Changes for Version 1.15.0 (2020-12-05)
Revise logic so that Framework-derived models have a CLI that
automatically looks for a weights file whose name matches the
specified INPUT data file name; if there is a match, CLI uses the
matching weights file, and the specified growth factors, to
extrapolate variables and weights to future simulation years.
Add more error-checking to the csvjoin tool and fix typos in its
--help documentation.
Changes for Version 1.14 (2020-11-26)
Add quantiles utility function and associated test.
Changes for Version 1.13 (2020-11-24)
Extensive enhancement of documentation, including most notably the
CLI --test command writing language-words and language-rules
documentation files that support discussion of the declarative
language used to tell the CLI what to do.
Changes for Version 1.12 (2020-11-12)
Change repository name to Tax-Analyzer-Framework and change name of
package from psgtaxanalyzer to taf.
Revise documentation to reflect name change.
Changes for Version 1.11 (2020-11-02)
Enhance csvshow CLI tool to display information from a broader
range of CSV-formatted files.
Enhance csvjoin CLI tool so that --var variables (other than
those in the --svar list) do not have to be in every --csv file.
Changes for Version 1.10 (2020-10-26)
Add tacat CLI tool that displays all lines in one or more files.
Changes for Version 1.9 (2020-10-22)
Add tatool CLI utility that lists all the tools made available by
the PSG-Tax-Analyzer framework.
Generalize code so that behavior response analysis can be conducted
for a variable whose name contains one or more underscore characters.
Changes for Version 1.8 (2020-10-16)
Generalize Policy class constructor to handle the case where its
growth argument is a dictionary including raw inflation rates and
wage growth rates.
Add Calculator class methods that return the raw inflation rates and
wage growth rates from the Policy object embedded in a Calculator
object.
Changes for Version 1.7 (2020-10-11)
Add FREFORM CLI argument that allows specification in a tax model
for a sub-national jurisdiction of a JSON reform file for national
jurisdiction tax policy.
Add flexibility so that each model derived from the framework can
include or exclude from the CLI each of the following: the FREFORM
argument, the --response option, the --growdiff option, and the
--calcopts option.
Changes for Version 1.6 (2020-09-23)
Add a JSON specification file that includes the names of functions
in the calcfunctions.py module that either have a Calculator object
as their only argument or return a value that is not a records variable.
Add ability to require a read-in records variable to be less than or
equal to (or greater than or equal to) the value of another read-in
records variable.
Changes for Version 1.5 (2020-09-05)
Make minor code changes to avoid errors arising from upgrade to
pylint 2.6 and pandas 1.1; tax results are unchanged.
Update minimum package versions in documentation.
Changes for Version 1.4 (2020-08-27)
Fix utility function csv_show so that the csvshow tool can display
row content from a wider range of CSV-formatted files.
Changes for Version 1.3 (2020-08-21)
Revise tests to handle tax data files with names ending in .gz (as
well as .zip).
Revise test of valid values for scale_factor, which is in the
records_weights.json file.
Treat variables with data type numpy.float32 as float variables in
the df2csv utility function.
Add imports of numpy and pandas to automatically-generated
calcfuncsfast.py module.
Changes for Version 1.2 (2020-07-22)
Add CalculatorOptions class and CLI --calcopts to change default
values of calculator-options parameters.
Changes for Version 1.1 (2020-07-13)
Simplify names of CLI output files by eliminating any trailing -#
characters from the base output file name.
Changes for Version 1.0 (2020-07-10)
Add source code and tests that, after nearly a year of private
development by the Policy Simulation Group, represent a
substantially revised, enhanced, and corrected version of
Tax-Calculator, which was originally developed by the Open
Source Policy Center.
Changes for Version 0.1 (2020-07-03)
Add initial versions of documentation files; no source code
available.