Add ability to build out unit cells from CIF files #11

Merged
merged 102 commits into from
Dec 20, 2024
Changes from 80 commits
Commits
656e9e1
Add _str2num and _deg2rad _utils
janbridley Apr 5, 2024
1e74eb7
Add cif file keys list to sample data
janbridley Apr 5, 2024
c369fd1
Add key_value_pairs reader and cell_params reader to parse
janbridley Apr 5, 2024
672c4e3
Add tests for key reader
janbridley Apr 5, 2024
e0b693f
Add tests for new utils
janbridley Apr 5, 2024
79350fc
Reorder test_key_reader
janbridley Apr 5, 2024
04b3344
Improve documentation for regex
janbridley Apr 5, 2024
b59eab1
Add warnings and tests to read_key_value_pairs
janbridley Apr 5, 2024
87303b9
Restore trailing spaces to downloaded CIF files
janbridley Apr 8, 2024
90120c7
Properly track keys containing "-"
janbridley Apr 8, 2024
d4203da
Improved tests for key value pair reader
janbridley Apr 8, 2024
8c3c014
Add key-value tests for INTENTIONALLY_BAD_CIF.cif
janbridley Apr 8, 2024
9c91bde
Fix docs
janbridley Apr 8, 2024
9aaba90
Enable top of page button
janbridley Apr 8, 2024
6ea7882
Update brand primary colors
janbridley Apr 8, 2024
0169783
Improve docs for parse.py
janbridley Apr 8, 2024
a404d19
Add __future__.annotations imports to relevant files
janbridley Apr 9, 2024
4903f80
Fix typo
janbridley Apr 10, 2024
a333c5c
Seperate _errors from _templates
janbridley Apr 10, 2024
b0f386b
Clean up docstring return types
janbridley Apr 10, 2024
96acd85
Add PDB cif to test suite
janbridley Apr 10, 2024
a6ebf33
Fix test in test_key_reader
janbridley Apr 10, 2024
f8dbaa3
Clean up patterns.py and add remove_nondelimiting_whitespace
janbridley Apr 10, 2024
b1e0bdd
Update table_reader to use remove_nondelimiting_whitespace
janbridley Apr 10, 2024
51328be
Allow value reader to read mmCIF files
janbridley Apr 10, 2024
06abb57
Update test_table_reader.py
janbridley Apr 10, 2024
98a2201
Remove seperate mmCIF reader
janbridley Apr 10, 2024
93909f8
Add docs for patterns module
janbridley Apr 10, 2024
d4d931b
Fix cast_to_float default value
janbridley Apr 10, 2024
1d86db9
Update docs
janbridley Apr 10, 2024
0528d36
Add documentation for __call__
janbridley Apr 10, 2024
40c7fb8
Update regex_filter param documentation
janbridley Apr 10, 2024
56c1e21
Fix typo
janbridley Apr 10, 2024
853a166
Remove unneeded comment
janbridley Apr 10, 2024
8b19268
Fix default values in docs
janbridley Apr 10, 2024
fd295a8
Fix typo
janbridley Apr 10, 2024
3e5e77c
Minor doc fix
janbridley Apr 10, 2024
ffa59a7
Fix typo
janbridley Apr 10, 2024
7f80005
Remove duplicate Introduction from index
janbridley Apr 10, 2024
1e8c01d
Remove duplicate entries from toc
janbridley Apr 10, 2024
56d80de
Add source for PDB cif
janbridley Apr 10, 2024
5d47d10
Add mmCIF flag to read_cell_params
janbridley Apr 10, 2024
dfbf5ed
Add quickstart.rst
janbridley Apr 10, 2024
28a7025
Fix comment in quickstart
janbridley Apr 10, 2024
e60cd1b
Remove unnecessary line in quickstart
janbridley Apr 10, 2024
6e82566
Fix image path in README.rst
janbridley Apr 11, 2024
a772261
Update regex documentation
janbridley Apr 11, 2024
7d03311
Fix CI
janbridley Apr 11, 2024
1f05fd7
Update __init__.py
janbridley Apr 15, 2024
0ddaa48
Add unitcells module
janbridley Apr 15, 2024
e1616ab
Add documentation links
janbridley Apr 15, 2024
04b19a3
Fix doc file naming
janbridley Apr 15, 2024
451fd8b
Remove resolved TODO
janbridley Apr 15, 2024
bba2849
Add top level description to unitcells
janbridley Apr 15, 2024
8d243ad
Default regex filters to None
janbridley Apr 15, 2024
5fece58
Fix default setting for nondelimiting_whitespace_replacement
janbridley Apr 15, 2024
85b7dff
Remove outdated comment
janbridley Apr 15, 2024
f4aefa7
Fix tests
janbridley Apr 15, 2024
43f0263
Add tests for symmetry operations
janbridley Apr 15, 2024
a713c06
Fix precision issues
janbridley Apr 15, 2024
a0a01cd
Increase string lines threshold
janbridley Apr 15, 2024
8c7f2c0
Remove in-file tests
janbridley Apr 15, 2024
8d5969d
Return unrounded values
janbridley Apr 15, 2024
923ce2c
Add test_extract_unit_cell
janbridley Apr 15, 2024
f96c7a9
Add distance calculation util for uniqueness comparison
janbridley Apr 26, 2024
b124cac
Add function to build basis vector matrix from box
janbridley Apr 26, 2024
cbf17ba
Update unitcell builder and rename to extract_atomic_positions
janbridley Apr 26, 2024
6ab5cd6
Update docstrings
janbridley Apr 26, 2024
e0eccb9
Clarify function naming
janbridley Apr 26, 2024
9bb9b2f
Change space group for IncStrDb_Ccmm.cif to standard format
janbridley Apr 26, 2024
a19b068
Remove unnecessary transpose in basis vector function
janbridley Apr 26, 2024
b3803be
Update unitcell tests to use ase
janbridley Apr 26, 2024
bf532c6
Filter out ase warnings
janbridley Apr 26, 2024
037d5d8
Switch catch_warnings to filterwarnings for python 3.9 compat
janbridley Apr 26, 2024
a1382d3
Add pytest to test requirements
janbridley Apr 27, 2024
b45218d
Fix top of page buttons and add view link
janbridley May 10, 2024
2ab4288
Fix logo on README.rst
janbridley May 10, 2024
e4cbbb8
Merge remote-tracking branch 'origin/main' into feature/supercells
janbridley May 22, 2024
e35ed1b
Merge branch 'main' into feature/supercells
janbridley Jun 3, 2024
bdc832d
Add readable assertion error in unitcells.py
janbridley Jun 10, 2024
c30d23e
Improve assertion error in _safe_eval
janbridley Jun 10, 2024
0632944
Remove unused distance-merge code
janbridley Jun 10, 2024
a023be2
Merge branch 'main' into feature/supercells
janbridley Dec 19, 2024
d65e094
Improve CI resilience
janbridley Dec 19, 2024
ccb13fd
Remove dependabot
janbridley Dec 19, 2024
e562353
Add requirements.txt for py3.6 and py3.7
janbridley Dec 19, 2024
d50b3ee
Update requirements.yaml CI action
janbridley Dec 19, 2024
78fbd68
Remove temporary lines from ci
janbridley Dec 19, 2024
9a2a5e7
Add future-annotations package
janbridley Dec 19, 2024
e0f186b
Fix annotations
janbridley Dec 19, 2024
22b145f
Disable testing on py3.6
janbridley Dec 20, 2024
f046aa0
Update doc requirements
janbridley Dec 20, 2024
3a3dbde
Decapitalize changelog and credits
janbridley Dec 20, 2024
e91767d
Swap tests to us UV
janbridley Dec 20, 2024
77184f6
Fix uv version
janbridley Dec 20, 2024
f3ae2d1
Clean up CI script
janbridley Dec 20, 2024
b85b73b
Activate venv
janbridley Dec 20, 2024
606be0b
Remove setup.py
janbridley Dec 20, 2024
a1be853
Fix CI
janbridley Dec 20, 2024
446f441
Clean up CI
janbridley Dec 20, 2024
cefdb5c
Simplify CI
janbridley Dec 20, 2024
7385f4b
Clean up overview
janbridley Dec 20, 2024
4 changes: 0 additions & 4 deletions README.rst
@@ -2,11 +2,7 @@

.. image:: doc/source/_static/parsnip_header_dark.svg
   :width: 600
   :class: only-light

.. image:: doc/source/_static/parsnip_header_light.svg
   :width: 600
   :class: only-dark

.. _header:

7 changes: 4 additions & 3 deletions doc/source/conf.py
@@ -44,7 +44,7 @@

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

source_repository = "https://github.com/glotzerlab/parsnip/"
html_theme = "furo"
html_static_path = ["_static"]
html_theme_options = {
@@ -59,7 +59,8 @@
        "color-brand-primary": "#005A50",
        "color-brand-content": "#406a8c",
    },
    "top_of_page_button": "edit",
    "source_edit_link": "https://github.com/glotzerlab/parsnip",
    "source_edit_link": "https://github.com/glotzerlab/parsnip/edit/main/doc/source/{filename}",
    "source_view_link": "https://github.com/glotzerlab/parsnip",
}

html_favicon = "_static/parsnip_logo_favicon.svg"
1 change: 1 addition & 0 deletions doc/source/index.rst
@@ -24,6 +24,7 @@

package-parse
package-patterns
package-unitcells


.. toctree::
7 changes: 7 additions & 0 deletions doc/source/package-unitcells.rst
@@ -0,0 +1,7 @@
Unitcells Module
==============================

.. rubric:: Overview

.. automodule:: parsnip.unitcells
   :members:
14 changes: 12 additions & 2 deletions parsnip/__init__.py
@@ -1,4 +1,14 @@
"""TODO: Add docstring."""
from . import parse, patterns
"""``parsnip``: a package for the simple reading and processing of .cif files.

While many packages for handling cif files exist, the vast majority suffer
from decades of feature creep and high levels of complexity. ``parsnip`` provides a
simple and minimal interface for reading cif files into Python primitive data structures
and numpy arrays. The ``parsnip.parse`` module contains exactly two functions that read
key-value and tabular data from cif files, and are all that are required for most users.
The ``parsnip.patterns`` module includes a few convenience features for manipulation of
the read data, and the ``parsnip.unitcells`` module includes functions to reconstruct
the basis positions of a crystal's unit cell from data stored in cif files.
"""
from . import parse, patterns, unitcells

__version__ = "0.0.2"
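The package docstring above mentions reading key-value data from cif files into Python primitives. As a rough illustration of what that involves, here is a minimal, standard-library-only sketch. It is not parsnip's implementation: the `KEY_VALUE` pattern and the `read_pairs` helper are hypothetical names invented for this example, and the pattern handles only simple single-line `_key value` entries.

```python
import re

# Hypothetical pattern (not taken from parsnip): a tag that starts with "_",
# then whitespace, then a value running to the end of the line.
KEY_VALUE = re.compile(r"^(_[\w.\-\[\]]+)\s+(\S.*?)\s*$")


def read_pairs(text):
    """Collect simple `_key value` lines from CIF-formatted text into a dict."""
    pairs = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match is None:
            continue  # skip data_ headers, loop_ keywords, table headings and rows
        key, value = match.groups()
        try:
            pairs[key] = float(value)  # numeric values become floats
        except ValueError:
            pairs[key] = value.strip("'\"")  # everything else stays a string
    return pairs


sample = """\
data_example
_cell_length_a 5.43
_cell_angle_alpha 90.0
_symmetry_space_group_name_H-M 'F d -3 m'
loop_
_atom_site_label
Si1
"""

pairs = read_pairs(sample)
```

Table headings such as `_atom_site_label` are skipped here because they carry no value on the same line; a real reader would also need to handle multi-line values, comments, and numbers with uncertainties in parentheses.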
13 changes: 13 additions & 0 deletions parsnip/_utils.py
@@ -9,3 +9,16 @@ def _str2num(val: str):
def _deg2rad(val: float):
    """Convert a value in degrees to one in radians."""
    return val * np.pi / 180


def _get_distances(positions: np.ndarray):
    # Get all unique index pairs with i < j
    i_indices, j_indices = np.triu_indices(len(positions), k=1)

    # Compute difference vectors
    r_xyz = positions[i_indices] - positions[j_indices]

    # Compute squared distances as row-wise dot products (no square root is taken).
    ij_distances = np.einsum("ij,ij->i", r_xyz, r_xyz, optimize="optimal")

    return ij_distances, i_indices, j_indices
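To make the return values of `_get_distances` concrete, the following standard-library sketch enumerates the same `i < j` pairs that `np.triu_indices(n, k=1)` yields and computes the dot product of each difference vector with itself. The helper name `pair_squared_distances` is hypothetical and exists only for this illustration; note that the quantity is the squared distance, since no square root is applied.

```python
from itertools import combinations


def pair_squared_distances(positions):
    """Return squared distances and index lists for every pair with i < j."""
    i_indices, j_indices, sq_distances = [], [], []
    # combinations() visits (0, 1), (0, 2), ..., in the same row-major order
    # as the upper triangle above the diagonal.
    for i, j in combinations(range(len(positions)), 2):
        diff = [a - b for a, b in zip(positions[i], positions[j])]
        sq_distances.append(sum(d * d for d in diff))
        i_indices.append(i)
        j_indices.append(j)
    return sq_distances, i_indices, j_indices


points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
sq, i_idx, j_idx = pair_squared_distances(points)
```

For `n` positions this produces `n * (n - 1) / 2` entries. Comparing squared distances against a squared tolerance avoids the square root entirely, which is one plausible reason for leaving it out of a uniqueness check.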
72 changes: 3 additions & 69 deletions parsnip/parse.py
@@ -9,7 +9,7 @@
This is an example of a simple CIF file. A `key`_ (data name or tag) must start with
an underscore, and is separated from the data value with whitespace characters.
A `table`_ begins with the ``loop_`` keyword, and contains a header block and a data
block. The vertical position of a tag in the table heading corresponds with the
block. The vertical position of a tag in the table headings corresponds with the
horizontal position of the associated column in the table values.

.. code-block:: text
@@ -49,7 +49,7 @@
import numpy as np

from ._errors import ParseError, ParseWarning
from ._utils import _deg2rad, _str2num
from ._utils import _str2num
from .patterns import LineCleaner, cast_array_to_float, remove_nondelimiting_whitespace


@@ -84,7 +84,7 @@ def read_table(
        nondelimiting_whitespace_replacement (str, optional):
            Character to replace non-delimiting whitespaces with.
            Default value = ``"_"``
        regex_filter (tuple[str,str], optional):
        regex_filter (tuple[str,str] | tuple[tuple[str,str]], optional):
            A tuple of strings that are compiled to a regex filter and applied to each
            data line. If a tuple of tuples of strings is provided instead, each pattern
            will be applied separately.
@@ -314,69 +314,3 @@ def read_key_value_pairs(
)

return data


def read_cell_params(filename, degrees: bool = True, mmcif: bool = False):
    r"""Read the cell lengths and angles from a CIF file.

    Args:
        filename (str): The name of the .cif file to be parsed.
        degrees (bool, optional):
            When True, angles are returned in degrees (as per the cif spec). When False,
            angles are converted to radians.
            Default value = ``True``
        mmcif (bool, optional):
            When False, the standard CIF key naming is used (e.g. _cell_angle_alpha).
            When True, the mmCIF standard is used instead (e.g. _cell.angle_alpha).
            Default value = ``False``

    Returns:
        tuple:
            The box vector lengths and angles in degrees or radians
            :math:`(L_1, L_2, L_3, \alpha, \beta, \gamma)`.
    """
    if mmcif:
        angle_keys = ("_cell.angle_alpha", "_cell.angle_beta", "_cell.angle_gamma")
        box_keys = ("_cell.length_a", "_cell.length_b", "_cell.length_c") + angle_keys
    else:
        angle_keys = ("_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma")
        box_keys = ("_cell_length_a", "_cell_length_b", "_cell_length_c") + angle_keys
    cell_data = read_key_value_pairs(filename, keys=box_keys, only_read_numerics=True)

    assert all(value is not None for value in cell_data.values())
    assert all(0 < cell_data[key] < 180 for key in angle_keys)

    if not degrees:
        for key in angle_keys:
            cell_data[key] = _deg2rad(cell_data[key])

    return tuple(cell_data.values())
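The key selection and unit handling in `read_cell_params` can be sketched without touching a file. The version below is an illustration under stated assumptions, not the code this PR moves or adds: `cell_params` and `CIF_KEYS` are hypothetical names, the input is an already-parsed dict, and explicit exceptions stand in for the `assert` statements of the original.

```python
import math

CIF_KEYS = (
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
)


def cell_params(pairs, degrees=True, mmcif=False):
    """Return (a, b, c, alpha, beta, gamma) from a dict of parsed key-value pairs."""
    # mmCIF separates the category from the item with a dot: _cell.length_a
    keys = [k.replace("_cell_", "_cell.", 1) if mmcif else k for k in CIF_KEYS]
    missing = [k for k in keys if pairs.get(k) is None]
    if missing:
        raise KeyError(f"missing cell parameters: {missing}")
    lengths = [float(pairs[k]) for k in keys[:3]]
    angles = [float(pairs[k]) for k in keys[3:]]
    if not all(0 < angle < 180 for angle in angles):
        raise ValueError("cell angles must lie strictly between 0 and 180 degrees")
    if not degrees:
        angles = [math.radians(angle) for angle in angles]
    return tuple(lengths + angles)


hexagonal = {
    "_cell_length_a": 4.0, "_cell_length_b": 4.0, "_cell_length_c": 6.0,
    "_cell_angle_alpha": 90.0, "_cell_angle_beta": 90.0, "_cell_angle_gamma": 120.0,
}
in_degrees = cell_params(hexagonal)
in_radians = cell_params(hexagonal, degrees=False)
```

Raising exceptions rather than asserting is a design choice for this sketch: `assert` statements are skipped when Python runs with `-O`, so validation written that way can silently disappear.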


def read_fractional_positions(
    filename: str,
    regex_filter: tuple = ((r",\s+", ",")),
):
    r"""Extract the fractional X,Y,Z coordinates from a CIF file.

    Args:
        filename (str): The name of the .cif file to be parsed.
        regex_filter (tuple[tuple[str,str]], optional):
            A tuple of strings that are compiled to a regex filter and applied to each
            data line. Default value = ``((r",\s+",","))``

    Returns:
        :math:`(N, 3)` :class:`numpy.ndarray[np.float32]`:
            Fractional X,Y,Z coordinates of the unit cell.
    """
    xyz_keys = ("_atom_site_fract_x", "_atom_site_fract_y", "_atom_site_fract_z")
    # Once #6 is added, we should warnings.catch_warnings(action="error")
    xyz_data = read_table(filename=filename, keys=xyz_keys, regex_filter=regex_filter)

    xyz_data = cast_array_to_float(arr=xyz_data, dtype=np.float32)

    # Validate results
    assert xyz_data.shape[1] == 3
    assert xyz_data.dtype == np.float32

    return xyz_data
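The `regex_filter` argument is described as either a single `(pattern, replacement)` pair or a tuple of such pairs applied one after another. A minimal sketch of that behaviour follows; `apply_filters` is a hypothetical helper written for this illustration, not a parsnip function. Worth noting: in Python the default written as `((r",\s+", ","))` is the same object as `(r",\s+", ",")`, because the outer parentheses do not create a nested tuple.

```python
import re


def apply_filters(line, regex_filter):
    """Apply one (pattern, replacement) pair, or a tuple of such pairs, to a line."""
    if isinstance(regex_filter[0], str):
        regex_filter = (regex_filter,)  # promote a single pair to a tuple of pairs
    for pattern, replacement in regex_filter:
        line = re.sub(pattern, replacement, line)
    return line


row = "1 x, y, z"
# Collapse "comma + whitespace" so the operation survives whitespace splitting.
cleaned = apply_filters(row, (r",\s+", ","))
columns = cleaned.split()

# Two filters applied in order: first the comma rule, then whitespace squeezing.
squeezed = apply_filters("Si1   0.25,  0.5", ((r",\s+", ","), (r"\s+", " ")))
```

Without the filter, splitting `row` on whitespace would yield four fragments instead of the two columns the table heading declares, which is the kind of mismatch such a filter is meant to prevent.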