Much of the software listed here is free and open source, with major contributions from (or developed entirely by) dedicated community members. These efforts have resulted in a vast selection of reliable, high-quality, and readily available software. I especially attempt to highlight projects developed under this philosophy that I have personally found to be useful or interesting.

1 The R Project

  • R The R Project for Statistical Computing.

  • CRAN A major repository of contributed R packages.

  • RStudio An integrated development environment (IDE) for R users. An open source project maintained by the Posit company.

  • sf A comprehensive package for working with shapefiles and other geographic objects.

  • tidyverse Packages for data wrangling, plotting, and other fundamental data science tasks.

  • pbdR Packages for high-performance computing, including R bindings to the Message Passing Interface (MPI).

  • Rcpp Framework for integrating C++ into R.

  • The Rcpp API project documents the API in a human-readable way. Appears to be curated manually by the author and not auto-generated.

  • R-Nimble Framework for constructing hierarchical models with compiled code for high performance.

  • LaplacesDemon Full-featured library for Bayesian computation.

  • Roxygen2 Tool for documenting R packages by adding comment blocks with markup.

2 Platforms for Statistical and Scientific Computing

  • Stan Declare Bayesian models and compile into fast executable programs that sample from the posterior by Hamiltonian Monte Carlo. Integrates with several other programming languages including R via the rstan package.

  • MultiBUGS An actively maintained BUGS platform for Bayesian modeling, which automates MCMC sampling given a user’s model specification. Other BUGS variants are OpenBUGS and WinBUGS. JAGS also follows a similar paradigm.

  • The Julia Language A high-level and high-performance language for technical computing.

  • Python High-level language for data science and general computing.

  • GNU Octave A scientific computing platform that is largely compatible with MATLAB syntax, but is open source.

  • Wolfram Alpha Search engine that can compute integrals, limits, etc.

3 Libraries

  • GSL The GNU Scientific Library. Implements a number of numerical methods including integration, optimization, and root-finding.

  • Netlib A repository of code and documentation for numerical methods. A good place to find reference Fortran implementations of standard libraries such as BLAS and LAPACK.

  • Armadillo A matrix algebra library for C++. Its interface is intuitive and well-documented. The library is accessible to Rcpp users through the package RcppArmadillo.

  • Eigen Another matrix algebra library for C++. It is also available to Rcpp users through the package RcppEigen. I find this one more difficult to use than Armadillo.

  • STL The C++ Standard Template Library provides a number of data structures, iterators, and algorithms for C++ programs.

  • Boost Provides a suite of data structures, iterators, and algorithms like the STL. However, the STL typically ships with C++ compilers, while Boost often needs to be installed separately.

  • autodiff Automatic differentiation (AD) in C++. Using special variables and operations to code functions, analytical derivatives may be calculated automatically. Helps to avoid tedious hand calculations and the accuracy issues of numerical derivatives. Implements both forward mode and reverse mode algorithms. Stan also has a library for AD which can be accessed in C++ with some work. Many other AD tools for various programming languages are listed at autodiff.org.
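
The idea behind forward mode AD can be illustrated with a small sketch. This is not the API of the autodiff library or of Stan; it is a toy Python illustration in which the names `Dual`, `_lift`, `sin`, and `derivative` are hypothetical. Each "special variable" carries a value together with its derivative, and each operation updates both using the usual rules of calculus.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u v)' = u' v + u v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants, which have derivative zero."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(x):
    """Sine for Dual numbers. Chain rule: (sin u)' = cos(u) u'."""
    x = _lift(x)
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input with derivative 1."""
    return f(Dual(float(x), 1.0)).der


def f(x):
    # Written once, in ordinary notation; no derivative is coded by hand.
    return x * x + 3 * sin(x)


# Should agree with the hand-derived f'(x) = 2x + 3 cos(x).
print(derivative(f, 1.5))
print(2 * 1.5 + 3 * math.cos(1.5))
```

Only addition, multiplication, and sine are supported here; a real AD library covers many more operations and also provides reverse mode, which is usually preferable for functions with many inputs.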

4 Document Authoring Tools

  • Latex System for preparing high quality documents. Especially useful for typesetting mathematics. A large selection of contributed packages is available on CTAN.

  • Quarto System for preparing articles, websites, books, and other documents with markdown and embedded R, Python, or Julia code. Code can be displayed with syntax highlighting and run dynamically as documents are compiled to facilitate reproducibility. Supports embedded Latex for typesetting math, BibTeX for references, and many other conveniences.

  • Beamer Package for making slides with Latex.

  • UMBCposter A Latex package for making posters. Developed and maintained by Dr. Rouben Rostamian at UMBC. Some alternatives are beamerposter and tikzposter.

  • TeXstudio IDE for authoring Latex documents.

  • Tikz Extensive Latex package for technical graphics such as diagrams and plots. Also see PGF and PGFPlots.

5 Collaboration

  • Git Tool for source control management. Especially suitable for text-oriented material such as source code.

  • Codeberg A free hosting service for Git projects which is maintained by a non-profit organization in Berlin.

  • Bitbucket, GitHub, GitLab Commercial services that host Git projects. Each has a free tier with limits on private projects and large file storage. They are a good option for enlisting collaborators who do not otherwise have an account.

  • Overleaf Collaboration environment for writing Latex documents. Provides a browser-based editor, and also supports Git to work outside of the editor.

  • Box, Dropbox, Google Drive Shared storage for files other than source code.

6 Productivity Tools

  • The GNU Project hosts a number of open source software projects, including the gcc compiler, make and autoconf build tools, as well as some of the other tools listed on the present page.

  • Slurm A scheduler to queue up computing jobs and run them as processors become available. Useful for running large simulations which can require hours or days. Most commonly used on multi-user systems and distributed computing clusters, but also useful on a PC.

6.1 Linux

  • Ubuntu A distribution of Linux maintained by the Canonical company. A number of variants are available such as a server edition and desktop editions with various window managers.

  • Arch Linux A distribution of Linux that follows a rolling release model. Good for keeping up with the most recent versions of packages. Also good for customization because the initial installation is minimal.

  • Linux The kernel for Linux distributions. Started as a personal project by Linus Torvalds and became a collaboration with contributors from around the world.

6.2 Graphical

  • Firefox A popular open source web browser.

  • Chromium The open source web browser on which Google Chrome is built.

  • Virtual Machine Manager An application to run and manage virtual machines. Uses the libvirt virtualization API. Both of these are open source.

  • OBS Studio Capture your PC’s display, audio, and video for recording or streaming.

  • Audacity Audio recording, editing, and processing.

  • Flowblade Video editing.

  • GIMP Image editing.

  • VLC A media player for audio and video.

  • dwm Extremely lightweight and minimalist tiling window manager for X11. Also see dwl for Wayland.

  • dmenu A lightweight X11 menu system which can be leveraged to create user defined menus. Menus can be traversed with the keyboard through fuzzy search. Something similar can be accomplished on the command line with fzf.

  • Xournalpp Take handwritten notes on your computer which can be edited and maintained as files. Especially useful with a writing tablet such as the Wacom Intuos.

  • Write Another excellent program for handwritten notes by Stylus Labs. Appears to be free but, unlike Xournalpp, not completely open source.

  • Dia Open source tool for drawing diagrams.

6.3 Terminal

  • Cygwin A collection of open source tools that provides a Linux-like experience in Windows, especially through the command line.

  • tmux Persistent terminal sessions that can be detached/attached and split into multiple panes. Here is a cheat sheet with some useful commands and keybindings.

  • screen An alternative to tmux; also provides persistent terminal sessions.

  • Vim Well-established text editor that can be used in non-graphical terminal environments. Tedious edits can often be accomplished with a small number of commands or keystrokes. A cheat sheet can be helpful to keep nearby.

  • Emacs An alternative to Vim which has a different feel.

  • Nano A user-friendly text editor for the terminal. Recommended over Vim and Emacs for new terminal users.

  • Bash One of the standard shells for Linux-based systems. Supports scripting which can be used to automate tedious tasks.

  • Mutt A well-established email client for the terminal. Also see NeoMutt which is a fork of Mutt that supports some additional features.

  • Alpine The successor of Pine which was another widely used email client for the terminal.

7 Professional Associations

8 U.S. Census Bureau

  • Center for Statistical Research and Methodology Information about the center and technical reports.

  • Census Data Explorer Explore and download data from the decennial census, American Community Survey, and other public data releases.

  • TIGER/Line Shapefiles Shapefiles for states, counties, and other geographic entities in the United States. Useful for plotting and spatial statistics.

  • 2020 Census Data Products Description of data products from the 2020 Decennial Census which make use of new methods to protect confidentiality. The release status for each product is given here.

9 University of Maryland, Baltimore County

10 Nostalgia

  • M-Net One of the first public-access UNIX systems. Hosted by the Arbornet organization in Ann Arbor, Michigan.

  • GeoCities Provided free hosting for personal websites.

  • Gopher An early protocol for browsing documents on the Internet through a system of text-based menus. It lost traction as graphical HTTP-based browsers such as NCSA Mosaic became widely used.