Tutorial

Purpose

This tutorial is an introduction to the SrMise library and command-line tool, intended to expose new users and developers to the major use cases and options anticipated for SrMise.

Generating interest in SrMise is another goal of these examples, and we hope you will discover exciting ways to apply its capabilities to your scientific goals. If you think SrMise may help you do so, please feel free to contact us through the DiffPy website.

http://www.diffpy.org

Overview

SrMise is an implementation of the ParSCAPE algorithm, which incorporates standard chi-square fitting within an iterative clustering framework. The algorithm supposes that, in the absence of an atomic structure model, the model complexity (informally, the number of extracted peaks) which can be justifiably obtained from a PDF is primarily determined by the experimental uncertainties. The Akaike Information Criterion (AIC), summarized in the manual, is the information-theoretic tool used to balance model complexity with goodness-of-fit.
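As a rough illustration of how the experimental uncertainty controls the justifiable model complexity, the sketch below compares two candidate models at two different uncertainties. It assumes the common Gaussian-error form of the AIC (chi-square plus twice the number of parameters); the form SrMise actually uses is the one summarized in the manual and may differ. The data, the two candidate models, and the helper names are invented for this example and are not part of SrMise.

```python
def chi_square(observed, model, sigma):
    """Chi-square misfit, assuming the same uncertainty sigma on every point."""
    return sum(((o - m) / sigma) ** 2 for o, m in zip(observed, model))


def aic(chi2, n_params):
    """Assumed AIC form: chi-square plus twice the number of parameters."""
    return chi2 + 2 * n_params


# Made-up data and two made-up candidate models (not SrMise output).
observed = [0.10, 0.90, 0.30, 0.80, 0.10]
candidates = {
    "one peak": {"values": [0.00, 1.00, 0.50, 1.00, 0.00], "n_params": 3},
    "two peaks": {"values": [0.10, 0.90, 0.35, 0.80, 0.10], "n_params": 6},
}


def preferred(sigma):
    """Return the name of the candidate with the lowest AIC at this sigma."""
    scores = {
        name: aic(chi_square(observed, c["values"], sigma), c["n_params"])
        for name, c in candidates.items()
    }
    return min(scores, key=scores.get)


for sigma in (0.05, 0.5):
    print(f"sigma = {sigma}: preferred model is {preferred(sigma)!r}")
```

With the small uncertainty the better-fitting model with more parameters has the lower AIC, while with the large uncertainty its extra parameters are no longer justified and the simpler model is preferred.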

Three primary use cases are envisioned for SrMise:

  1. Peak fitting, where user-specified peaks are fit to the experimental data.

  2. Peak extraction, where the number of peaks and their parameters are estimated solely from the experimental data.

  3. Multimodel selection, where multiple sets of peaks are ranked in an AIC-driven analysis to determine the most plausible sets to guide additional investigation.

Productively running SrMise requires, at a minimum, the following elements:

  1. An experimental PDF. Note that peak extraction, though not peak fitting, requires that all peaks of interest be positive. This rules out peak extraction using SrMise for neutron PDFs obtained from samples containing elements with both positive and negative scattering factors.

  2. The experimental uncertainties. In principle these should be reported with the data, but in practice experimental uncertainties are frequently not reported, or are unreliable due to details of the data reduction process. In these cases the user should specify an ad hoc value. In peak extraction an ad hoc uncertainty necessarily results in ad hoc model complexity, or, more precisely, a reasonable model complexity if the provided uncertainty is presumed correct. (Even when the uncertainties are known, specifying an ad hoc value can be a pragmatic tool for exploring alternate models, especially in conjunction with multimodeling analysis.) For both peak extraction and peak fitting the estimated uncertainties of peak parameters (i.e. location, width, intensity) are dependent on the experimental uncertainty.

  3. The PDF baseline. For crystalline samples the baseline is linear and can be readily estimated. For nanoparticles more effort is required as SrMise includes explicit support for only a few basic shapes, although the user can define a baseline using arbitrary polynomials or an interpolating function constructed from a list of arbitrary numerical values.

  4. The range over which to extract or fit peaks. By default SrMise will use the entire PDF, but it is usually wise to restrict the range to the region of immediate interest.

The examples described below, though not exhaustive, go into detail about each of these points. They also cover other parameters for which good default values can usually be estimated directly from the data.
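To make the last two elements concrete, here is a minimal sketch in plain Python of restricting a PDF to a range of interest and subtracting a linear baseline that passes through the origin. It does not use the SrMise API. The function names, the toy data, and the slope value are all hypothetical; in practice the baseline is estimated from the data as described in the examples.

```python
def restrict_range(r, g, rmin, rmax):
    """Keep only the points with rmin <= r <= rmax."""
    kept = [(ri, gi) for ri, gi in zip(r, g) if rmin <= ri <= rmax]
    return [ri for ri, _ in kept], [gi for _, gi in kept]


def subtract_linear_baseline(r, g, slope):
    """Subtract the baseline slope * r (a line through the origin)."""
    return [gi - slope * ri for ri, gi in zip(r, g)]


# Toy PDF: a linear baseline plus a single one-point "peak" at r = 3.0.
slope = -0.5  # hypothetical baseline slope
r = [0.5 * i for i in range(1, 21)]  # 0.5, 1.0, ..., 10.0
g = [slope * ri + (2.0 if ri == 3.0 else 0.0) for ri in r]

# Restrict to the region of immediate interest, then remove the baseline.
r_sub, g_sub = restrict_range(r, g, 2.0, 4.0)
peaks_only = subtract_linear_baseline(r_sub, g_sub, slope)

for ri, pi in zip(r_sub, peaks_only):
    print(ri, pi)
```

After the baseline is removed, only the positive peak contribution remains within the chosen range, which is the quantity that peak extraction works with.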

Getting Started

The examples are contained in the doc/examples/ directory of the SrMise source distribution, available as both a zip and a tar.gz archive. Download one of these files (Windows users will generally favor the .zip, while Linux/Mac users will generally favor the .tar.gz) to a directory of your choosing.

Uncompress the archive. If the downloaded file is archivename.zip or archivename.tar.gz this will create a new directory archivename in its current directory. On Windows this can be accomplished by right-clicking and choosing “Extract all”. On Linux/Mac OS X run, from the containing directory,

tar xvzf archivename.tar.gz

From a command window change to the doc/examples directory of the new folder. For example, a Windows user who extracted archivename.zip in the folder C:\Research would type

cd C:\Research\archivename\doc\examples

Every example below includes a Python script that may be run from this directory. While such scripts expose the full functionality of SrMise, for many common tasks the command-line program srmise included with the package is both sufficient and convenient, and the tutorial uses it to introduce many fundamental concepts. Its options may be examined in detail by running

srmise --help

It is recommended to work through, in the order presented, at least the command-line portion of each example. Users looking for more detail should find the copiously commented scripts helpful.