diffpy.structure.parsers package
Submodules
diffpy.structure.parsers.p_auto module
Parser for automatic file format detection.
This parser does not provide the toLines() method.
- class diffpy.structure.parsers.p_auto.P_auto(**kw)[source]
Bases:
StructureParser
Parser with automatic detection of structure format.
This parser attempts to automatically detect the format of a given structure file and parse it accordingly. When successful, it sets its format attribute to the detected structure format.
- Parameters:
**kw (dict) – Keyword arguments for the structure parser.
- format
Detected structure format. Initially set to “auto” and updated after successful detection of the structure format.
- Type:
str
- pkw
Keyword arguments passed to the parser.
- Type:
dict
- parse(s)[source]
Detect format and create Structure instance from a string.
Set format attribute to the detected file format.
- Parameters:
s (str) – String with structure data.
- Returns:
Structure object.
- Return type:
Structure
- Raises:
StructureFormatError – If the structure format is unknown or invalid.
- parseFile(filename)[source]
Detect format and create Structure instance from an existing file.
Set format attribute to the detected file format.
- Parameters:
filename (str) – Path to structure file.
- Returns:
Structure object.
- Return type:
Structure
- Raises:
StructureFormatError – If the structure format is unknown or invalid.
IOError – If the file cannot be read.
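The detection described for P_auto can be pictured as trying candidate parsers in turn and keeping the first one that succeeds. The following is a minimal sketch of that idea, not the implementation of P_auto; `FormatError`, the two toy parsers, and `detect_and_parse` are hypothetical names introduced only for illustration.

```python
class FormatError(Exception):
    """Hypothetical stand-in for StructureFormatError."""


def parse_counted(text):
    """Toy parser: the first line must be an integer atom count."""
    lines = text.strip().splitlines()
    if not lines or not lines[0].strip().isdigit():
        raise FormatError("first line is not an atom count")
    return lines[2:]  # skip the count and title lines


def parse_plain(text):
    """Toy parser: every line must have 3 or 4 whitespace-separated columns."""
    rows = [line.split() for line in text.strip().splitlines()]
    if not rows or any(len(row) not in (3, 4) for row in rows):
        raise FormatError("expected 3 or 4 columns per line")
    return rows


# Candidate formats are tried in this order (an assumption of the sketch).
CANDIDATES = [("counted", parse_counted), ("plain", parse_plain)]


def detect_and_parse(text):
    """Return (detected_format, parsed_data) for the first parser that works."""
    for name, parser in CANDIDATES:
        try:
            return name, parser(text)
        except FormatError:
            continue
    raise FormatError("unknown or invalid format")
```

Returning the detected name alongside the data mirrors how the parser records the detected format in its format attribute after a successful parse.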
diffpy.structure.parsers.p_cif module
Parser for basic CIF file format.
- diffpy.structure.parsers.p_cif.rx_float
Constant regular expression for leading_float().
- Type:
re.Pattern
- diffpy.structure.parsers.p_cif.symvec
Helper dictionary for getSymOp().
- Type:
dict
Note
References: https://www.iucr.org/resources/cif
- class diffpy.structure.parsers.p_cif.P_cif(eps=None)[source]
Bases:
StructureParser
Simple parser for CIF structure format.
Reads the structure from the first block containing the _atom_site_label key. Any following blocks are ignored.
- Parameters:
eps (float, Optional) – Fractional coordinates cutoff for duplicate positions. When None, use the default for ExpandAsymmetricUnit: 1.0e-5.
- format
Structure format name.
- Type:
str
- ciffile
Instance of CifFile from PyCifRW.
- Type:
CifFile
- spacegroup
Instance of SpaceGroup used for symmetry expansion.
- Type:
SpaceGroup
- eps
Resolution in fractional coordinates for non-equal positions. Used for expansion of asymmetric unit.
- Type:
float
- eau
Instance of ExpandAsymmetricUnit from SymmetryUtilities.
- Type:
ExpandAsymmetricUnit
- asymmetric_unit
List of Atom instances for the original asymmetric unit in the CIF file.
- Type:
list
- labelindex
Dictionary mapping unique atom label to index of Atom in self.asymmetric_unit.
- Type:
dict
- anisotropy
Dictionary mapping unique atom label to displacement anisotropy resolved at that site.
- Type:
dict
- cif_sgname
Space group name obtained by looking up the value of the _space_group_name_Hall, _symmetry_space_group_name_Hall, _space_group_name_H-M_alt, and _symmetry_space_group_name_H-M items. None when none of these items is defined.
- Type:
str or None
- BtoU = 0.012665147955292222
Conversion factor from B values to U values.
- Type:
float
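The listed value of BtoU agrees with the standard crystallographic relation B = 8π²U between the isotropic B factor and the mean-square displacement U. A quick numerical check, with `B_TO_U` and `b_to_u` as illustrative names that are not part of the library:

```python
import math

# Conversion factor implied by B = 8 * pi**2 * U.
B_TO_U = 1.0 / (8.0 * math.pi ** 2)


def b_to_u(b_value):
    """Convert an isotropic B factor to the equivalent U value."""
    return b_value * B_TO_U
```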
- parse(s)[source]
Create Structure instance from a string in CIF format.
- Parameters:
s (str) – A string in CIF format.
- Returns:
Structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – When the data do not constitute a valid CIF format.
- parseFile(filename)[source]
Create Structure from an existing CIF file.
- Parameters:
filename (str) – Path to structure file.
- Returns:
Structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – When the data do not constitute a valid CIF format.
IOError – When the file cannot be opened.
- parseLines(lines)[source]
Parse list of lines in CIF format.
- Parameters:
lines (list) – List of strings stripped of line terminator.
- Returns:
Structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – When the data do not constitute a valid CIF format.
- diffpy.structure.parsers.p_cif.getParser(eps=None)[source]
Return new parser object for CIF format.
- Parameters:
eps (float, Optional) – Fractional coordinates cutoff for duplicate positions. When None, use the default for ExpandAsymmetricUnit: 1.0e-5.
- Returns:
Instance of P_cif.
- Return type:
P_cif
- diffpy.structure.parsers.p_cif.getSymOp(s)[source]
Create SpaceGroups.SymOp instance from a string.
- Parameters:
s (str) – Formula for equivalent coordinates, for example 'x,1/2-y,1/2+z'.
- Returns:
Instance of SymOp.
- Return type:
SymOp
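To make the formula syntax concrete, a string such as 'x,1/2-y,1/2+z' can be split into a 3x3 rotation part and a translation vector. The sketch below uses only the standard library and does not build a SymOp instance; `parse_symop_sketch` is a hypothetical helper, and it assumes translations are written as integers or simple fractions (decimal values are not handled).

```python
import re
from fractions import Fraction

AXIS_INDEX = {"x": 0, "y": 1, "z": 2}
# One signed term: either an axis letter or a number such as 1/2 or 3.
TERM = re.compile(r"([+-]?)(\d+(?:/\d+)?|[xyz])")


def parse_symop_sketch(formula):
    """Split a formula like 'x,1/2-y,1/2+z' into (rotation, translation)."""
    parts = formula.replace(" ", "").lower().split(",")
    if len(parts) != 3:
        raise ValueError("expected three comma-separated expressions")
    rotation = [[0, 0, 0] for _ in range(3)]
    translation = [Fraction(0)] * 3
    for row, expr in enumerate(parts):
        pos = 0
        for match in TERM.finditer(expr):
            # Terms must be contiguous, and every term after the first
            # needs an explicit sign.
            if match.start() != pos or (pos > 0 and not match.group(1)):
                break
            pos = match.end()
            sign = -1 if match.group(1) == "-" else 1
            token = match.group(2)
            if token in AXIS_INDEX:
                rotation[row][AXIS_INDEX[token]] += sign
            else:
                translation[row] += sign * Fraction(token)
        if not expr or pos != len(expr):
            raise ValueError("cannot parse %r" % expr)
    return rotation, translation
```

For the example formula, the second row maps y to -y and adds a shift of 1/2, and the third row keeps z and adds 1/2.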
- diffpy.structure.parsers.p_cif.leading_float(s, d=0.0)[source]
Extract the first float from a string and ignore trailing characters.
Useful for extracting values from “value(std)” syntax.
- Parameters:
s (str) – The string to be scanned for floating point value.
d (float, Optional) – The default value when s is “.” or “?”, which in CIF format stands for inapplicable and unknown, respectively.
- Returns:
The extracted floating point value.
- Return type:
float
- Raises:
ValueError – When string does not start with a float.
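The "value(std)" handling can be illustrated with a small regular-expression helper. This is a sketch of the documented behavior and not the library's code; `RX_FLOAT_SKETCH` and `leading_float_sketch` are hypothetical names, and the pattern is an assumption about what counts as a float.

```python
import re

# Optional sign, digits with optional fraction (or a bare fraction),
# and an optional exponent.
RX_FLOAT_SKETCH = re.compile(r"[-+]?(\d+(\.\d*)?|\d*\.\d+)([eE][-+]?\d+)?")


def leading_float_sketch(s, d=0.0):
    """Return the float at the start of s, ignoring trailing characters."""
    text = s.strip()
    if text in (".", "?"):
        # CIF markers for inapplicable and unknown values.
        return d
    match = RX_FLOAT_SKETCH.match(text)
    if match is None:
        raise ValueError("string does not start with a float: %r" % s)
    return float(match.group())
```

With this helper, a CIF value such as "1.234(5)" yields 1.234, because the parenthesized standard uncertainty is outside the matched prefix.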
diffpy.structure.parsers.p_discus module
Parser for DISCUS structure format.
- class diffpy.structure.parsers.p_discus.P_discus[source]
Bases:
StructureParser
Parser for DISCUS structure format. The parser fails on molecule and generator records.
- format
File format name, default “discus”.
- Type:
str
- nl
Line number of the current line being parsed.
- Type:
int
- lines
List of lines from the input file.
- Type:
list of str
- line
Current line being parsed.
- Type:
str
- stru
Structure being parsed.
- Type:
- ignored_lines
List of lines that were ignored during parsing.
- Type:
list of str
- cell_read
True if the cell record has been processed.
- Type:
bool
- ncell_read
True if the ncell record has been processed.
- Type:
bool
- parseLines(lines)[source]
Parse list of lines in DISCUS format.
- Parameters:
lines (list of str) – List of lines from the input file.
- Returns:
Parsed PDFFitStructure instance.
- Return type:
PDFFitStructure
- Raises:
StructureFormatError – If the file is not in DISCUS format.
diffpy.structure.parsers.p_pdb module
Basic parser for PDB structure format.
Note
- class diffpy.structure.parsers.p_pdb.P_pdb[source]
Bases:
StructureParser
Simple parser for PDB format.
The parser understands the following PDB records: TITLE, CRYST1, SCALE1, SCALE2, SCALE3, ATOM, SIGATM, ANISOU, SIGUIJ, TER, HETATM, END.
- format
Format name, default “pdb”.
- Type:
str
- atomLines(stru, idx)[source]
Build ATOM records and possibly SIGATM, ANISOU or SIGUIJ records for atom number idx of structure stru.
- orderOfRecords = ['HEADER', 'OBSLTE', 'TITLE', 'CAVEAT', 'COMPND', 'SOURCE', 'KEYWDS', 'EXPDTA', 'AUTHOR', 'REVDAT', 'SPRSDE', 'JRNL', 'REMARK', 'REMARK', 'REMARK', 'REMARK', 'DBREF', 'SEQADV', 'SEQRES', 'MODRES', 'HET', 'HETNAM', 'HETSYN', 'FORMUL', 'HELIX', 'SHEET', 'TURN', 'SSBOND', 'LINK', 'HYDBND', 'SLTBRG', 'CISPEP', 'SITE', 'CRYST1', 'ORIGX1', 'ORIGX2', 'ORIGX3', 'SCALE1', 'SCALE2', 'SCALE3', 'MTRIX1', 'MTRIX2', 'MTRIX3', 'TVECT', 'MODEL', 'ATOM', 'SIGATM', 'ANISOU', 'SIGUIJ', 'TER', 'HETATM', 'ENDMDL', 'CONECT', 'MASTER', 'END']
Ordered list of PDB record labels.
- Type:
list
- parseLines(lines)[source]
Parse list of lines in PDB format.
- Parameters:
lines (list of str) – List of lines in PDB format.
- Returns:
Parsed structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – Invalid PDB record.
- toLines(stru)[source]
Convert Structure stru to a list of lines in PDB format.
- Parameters:
stru (Structure) – Structure to be converted.
- Returns:
List of lines in PDB format.
- Return type:
list of str
- validRecords = {'ANISOU': None, 'ATOM': None, 'AUTHOR': None, 'CAVEAT': None, 'CISPEP': None, 'COMPND': None, 'CONECT': None, 'CRYST1': None, 'DBREF': None, 'END': None, 'ENDMDL': None, 'EXPDTA': None, 'FORMUL': None, 'HEADER': None, 'HELIX': None, 'HET': None, 'HETATM': None, 'HETNAM': None, 'HETSYN': None, 'HYDBND': None, 'JRNL': None, 'KEYWDS': None, 'LINK': None, 'MASTER': None, 'MODEL': None, 'MODRES': None, 'MTRIX1': None, 'MTRIX2': None, 'MTRIX3': None, 'OBSLTE': None, 'ORIGX1': None, 'ORIGX2': None, 'ORIGX3': None, 'REMARK': None, 'REVDAT': None, 'SCALE1': None, 'SCALE2': None, 'SCALE3': None, 'SEQADV': None, 'SEQRES': None, 'SHEET': None, 'SIGATM': None, 'SIGUIJ': None, 'SITE': None, 'SLTBRG': None, 'SOURCE': None, 'SPRSDE': None, 'SSBOND': None, 'TER': None, 'TITLE': None, 'TURN': None, 'TVECT': None}
Dictionary of PDB record labels.
- Type:
dict
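In the PDB format the record label occupies the first six columns of each line, which is what makes a lookup table of labels useful. The sketch below checks lines against a small set of labels; `UNDERSTOOD_RECORDS`, `record_label`, and `unknown_records` are hypothetical names, and the set only repeats the records that the parser is documented to understand, not the full validRecords dictionary.

```python
# Labels the parser is documented to understand (subset of validRecords).
UNDERSTOOD_RECORDS = {
    "TITLE", "CRYST1", "SCALE1", "SCALE2", "SCALE3", "ATOM",
    "SIGATM", "ANISOU", "SIGUIJ", "TER", "HETATM", "END",
}


def record_label(line):
    """Return the record label stored in columns 1-6 of a PDB line."""
    return line[:6].strip()


def unknown_records(lines):
    """List labels of non-blank lines that are not in UNDERSTOOD_RECORDS."""
    labels = (record_label(line) for line in lines)
    return [label for label in labels if label and label not in UNDERSTOOD_RECORDS]
```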
diffpy.structure.parsers.p_pdffit module
Parser for PDFfit structure format.
- class diffpy.structure.parsers.p_pdffit.P_pdffit[source]
Bases:
StructureParser
Parser for PDFfit structure format.
- format
Format name, default “pdffit”.
- Type:
str
- ignored_lines
List of lines ignored during parsing.
- Type:
list
- stru
Structure instance used for PDFfit input or output.
- Type:
- parseLines(lines)[source]
Parse list of lines in PDFfit format.
- Parameters:
lines (list of str) – List of lines in PDFfit format.
- Returns:
Parsed structure instance.
- Return type:
- Raises:
StructureFormatError – File not in PDFfit format.
diffpy.structure.parsers.p_rawxyz module
Parser for raw XYZ file format.
Raw XYZ is a 3- or 4-column text file with Cartesian coordinates of atoms and an optional first column for atom types.
- class diffpy.structure.parsers.p_rawxyz.P_rawxyz[source]
Bases:
StructureParser
StructureParser subclass for the RAWXYZ format.
- format
Format name, default “rawxyz”.
- Type:
str
- parseLines(lines)[source]
Parse list of lines in RAWXYZ format.
- Parameters:
lines (list of str) – List of lines in RAWXYZ format.
- Returns:
Parsed structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – Invalid RAWXYZ format.
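The 3- or 4-column layout of raw XYZ data can be read with a few lines of standard-library code. This is an illustrative sketch rather than the logic of P_rawxyz: `parse_rawxyz_sketch` is a hypothetical function, it returns plain tuples instead of a Structure, and skipping blank lines is an assumption.

```python
def parse_rawxyz_sketch(lines):
    """Return a list of (element, (x, y, z)) tuples from raw XYZ lines."""
    atoms = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) == 3:
            element, coords = "", fields  # no atom type column
        elif len(fields) == 4:
            element, coords = fields[0], fields[1:]
        else:
            raise ValueError("line %d: expected 3 or 4 columns" % number)
        try:
            xyz = tuple(float(value) for value in coords)
        except ValueError:
            raise ValueError("line %d: non-numeric coordinate" % number) from None
        atoms.append((element, xyz))
    return atoms
```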
diffpy.structure.parsers.p_xcfg module
Parser for extended CFG format used by AtomEye.
- diffpy.structure.parsers.p_xcfg.AtomicMass
Dictionary of atomic masses for elements.
- Type:
dict
- class diffpy.structure.parsers.p_xcfg.P_xcfg[source]
Bases:
StructureParser
Parser for AtomEye extended CFG format.
- format
Format name, default “xcfg”.
- Type:
str
- cluster_boundary = 2
Width of boundary around corners of a non-periodic cluster to avoid periodic boundary condition (PBC) effects in AtomEye.
- Type:
int
- parseLines(lines)[source]
Parse list of lines in XCFG format.
- Parameters:
lines (list of str) – List of lines in XCFG format.
- Returns:
Parsed structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – Invalid XCFG format.
- toLines(stru)[source]
Convert Structure stru to a list of lines in XCFG atomeye format.
- Parameters:
stru (Structure) – Structure to be converted.
- Returns:
List of lines in XCFG format.
- Return type:
list of str
- Raises:
StructureFormatError – Cannot convert empty structure to XCFG format.
diffpy.structure.parsers.p_xyz module
Parser for XYZ file format, where:
- First line gives the number of atoms.
- Second line has an optional title.
- Remaining lines contain element, x, y, z.
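The three-part layout listed above can be sketched as a small reader that also checks the declared atom count against the number of atom lines. This is not the code of P_xyz; `parse_xyz_sketch` is a hypothetical function that returns plain Python data, and the strict count check is an assumption of the sketch.

```python
def parse_xyz_sketch(text):
    """Return (title, atoms) where atoms is a list of (element, (x, y, z))."""
    lines = text.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("expected an atom count line and a title line")
    natoms = int(lines[0])  # raises ValueError when not an integer
    title = lines[1].strip()
    atoms = []
    for line in lines[2:]:
        fields = line.split()
        if len(fields) < 4:
            raise ValueError("expected element, x, y, z: %r" % line)
        xyz = tuple(float(value) for value in fields[1:4])
        atoms.append((fields[0], xyz))
    if len(atoms) != natoms:
        raise ValueError("declared %d atoms, found %d" % (natoms, len(atoms)))
    return title, atoms
```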
- class diffpy.structure.parsers.p_xyz.P_xyz[source]
Bases:
StructureParser
Parser for standard XYZ structure format.
- format
Format name, default “xyz”.
- Type:
str
- parseLines(lines)[source]
Parse list of lines in XYZ format.
- Parameters:
lines (list of str) – List of lines in XYZ format.
- Returns:
Parsed structure instance.
- Return type:
Structure
- Raises:
StructureFormatError – Invalid XYZ format.
diffpy.structure.parsers.parser_index_mod module
Index of recognized structure formats, their IO capabilities and associated modules where they are defined.
- diffpy.structure.parsers.parser_index_mod.parser_index
Dictionary of recognized structure formats. The keys are format names and the values are dictionaries with the following keys:
- module (str) – Name of the module that defines the parser class.
- file_extension (str) – File extension for the format, including the leading dot.
- file_pattern (str) – File pattern for the format, using ‘|’ as separator for multiple patterns.
- has_input (bool) – True if the parser can read the format.
- has_output (bool) – True if the parser can write the format.
- Type:
dict
Note
Plugins for new structure formats need to be added to the parser_index dictionary in this module.
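The shape of such an index can be illustrated with a small stand-in dictionary. The entries below are illustrative placeholder values, not copied from the library; `PARSER_INDEX_SKETCH`, `formats_with`, and `patterns_for` are hypothetical names. Only the key names follow the description above.

```python
# Illustrative entries only; the real parser_index may hold different values.
PARSER_INDEX_SKETCH = {
    "auto": {
        "module": "p_auto",
        "file_extension": "",
        "file_pattern": "*.*",
        "has_input": True,
        "has_output": False,
    },
    "pdb": {
        "module": "p_pdb",
        "file_extension": ".pdb",
        "file_pattern": "*.pdb|*.ent",
        "has_input": True,
        "has_output": True,
    },
    "xcfg": {
        "module": "p_xcfg",
        "file_extension": ".xcfg",
        "file_pattern": "*.xcfg|*.cfg",
        "has_input": True,
        "has_output": True,
    },
}


def formats_with(index, capability):
    """Return sorted format names whose entry has the capability flag set."""
    return sorted(name for name, info in index.items() if info[capability])


def patterns_for(index, name):
    """Split the '|'-separated file_pattern of a format into a list."""
    return index[name]["file_pattern"].split("|")
```

Filtering on has_input or has_output in this way is one plausible route to the lists of available input and output formats.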
diffpy.structure.parsers.structureparser module
Definition of StructureParser, a base class for specific parsers.
- class diffpy.structure.parsers.structureparser.StructureParser[source]
Bases:
object
Base class for all structure parsers.
- format
Format name of particular parser.
- Type:
str
- filename
Path to structure file that is read or written.
- Type:
str
- parseLines(lines)[source]
Create Structure instance from a list of lines.
Return Structure object or raise StructureFormatError exception.
Note
This method must be overridden in a derived class.
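The override requirement can be sketched with a stand-alone pair of classes. Both classes are hypothetical and do not import or reproduce StructureParser; the toy "csvxyz" format and the parse() convenience wrapper are assumptions made only to show the pattern of a base class that leaves parseLines() and toLines() to its subclasses.

```python
class BaseParserSketch:
    """Hypothetical base class; not the library's StructureParser."""

    format = None

    def parseLines(self, lines):
        raise NotImplementedError("parseLines is not defined for %r" % self.format)

    def toLines(self, stru):
        raise NotImplementedError("toLines is not defined for %r" % self.format)

    def parse(self, s):
        # Convenience wrapper (assumed): split the text and delegate.
        return self.parseLines(s.splitlines())


class CsvXyzParserSketch(BaseParserSketch):
    """Toy format with one 'element,x,y,z' record per line."""

    format = "csvxyz"

    def parseLines(self, lines):
        atoms = []
        for line in lines:
            if not line.strip():
                continue
            element, x, y, z = line.split(",")
            atoms.append((element.strip(), (float(x), float(y), float(z))))
        return atoms

    def toLines(self, stru):
        return ["%s,%s,%s,%s" % (el, x, y, z) for el, (x, y, z) in stru]
```

Keeping the string and file entry points in the base class means a new format only has to supply the two line-based methods.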
Module contents
Conversion plugins for various structure formats.
The recognized structure formats are defined by subclassing StructureParser. By convention these classes are named P_<format> and live in a module named p_<format>.py. The parser classes should override the parseLines() and toLines() methods of StructureParser. Any structure parser needs to be registered in the parser_index dictionary of the parser_index_mod module.
For normal usage it should be sufficient to use the routines provided in this module.
- Content:
StructureParser: base class for a concrete Parser
parser_index: dictionary of known structure formats
getParser: factory for a Parser of a given format
inputFormats: list of available input formats
outputFormats: list of available output formats
- diffpy.structure.parsers.getParser(format, **kw)[source]
Return Parser instance for a given structure format.
- Parameters:
format (str) – String with the format name, see parser_index_mod.
**kw (dict) – Keyword arguments passed to the Parser init function.
- Returns:
Parser instance for the given format.
- Return type:
Parser
- Raises:
StructureFormatError – When the format is not defined.
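A short usage sketch for the factory follows. It assumes that diffpy.structure is installed, that the parser returned for "xyz" offers a parse() method taking a string (as documented for P_auto and P_cif), and that the resulting structure supports len(). The import is guarded so the snippet degrades to None when the package is unavailable; `count_atoms_with_diffpy` and `XYZ_TEXT` are hypothetical names.

```python
XYZ_TEXT = "2\ncarbon monoxide\nC 0.0 0.0 0.0\nO 0.0 0.0 1.128\n"


def count_atoms_with_diffpy(text, fmt="xyz"):
    """Return the number of parsed atoms, or None without diffpy.structure."""
    try:
        from diffpy.structure.parsers import getParser
    except ImportError:
        return None
    parser = getParser(fmt)      # factory lookup by format name
    stru = parser.parse(text)    # assumed: parse() accepts a string
    return len(stru)             # assumed: the structure is list-like


atom_count = count_atoms_with_diffpy(XYZ_TEXT)
```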