PyMKS Overview

PyMKS is an open-source, Pythonic implementation of the methodologies developed under the aegis of the Materials Knowledge System (MKS) to build salient process-structure-property linkages for materials science applications. PyMKS provides efficient tools for obtaining a digital, uniform-grid representation of a material's internal structure in terms of its local states, and for computing hierarchical descriptors of that structure that can be used to build efficient machine-learning-based mappings to the relevant response space.
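As a rough illustration of what a uniform-grid representation in terms of local states can look like, the sketch below one-hot encodes integer phase labels with plain NumPy. It does not use the PyMKS API; the `discretize` helper and the sample array are hypothetical and exist only for this example.

```python
import numpy as np


def discretize(microstructure, n_state):
    """One-hot encode integer phase labels into local-state indicator fields.

    Hypothetical helper for illustration only (not a PyMKS function).
    The output gains a trailing axis of length n_state.
    """
    labels = np.asarray(microstructure)
    return (labels[..., None] == np.arange(n_state)).astype(float)


# A 3x3 two-phase microstructure on a uniform grid (0 and 1 are phase labels).
structure = np.array([[0, 1, 0],
                      [1, 1, 0],
                      [0, 0, 1]])

local_states = discretize(structure, n_state=2)

# Averaging each indicator field over the grid gives the phase volume fractions.
volume_fractions = local_states.mean(axis=(0, 1))
```

Each grid cell carries exactly one active local state, so the indicator fields sum to one everywhere.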

The various materials data analytics workflows developed under the MKS paradigm conform to the data transformation pipeline architecture typical of most data science workflows. The workflows can be boiled down to a data preprocessing step, followed by a feature generation step (fingerprinting) and a model construction step (including hyperparameter optimization). PyMKS, written in a functional programming style and supporting distributed computation (multi-core, multi-threaded, cluster), provides modular functionality for each of these data transformation steps while making full use of the underlying computing environment.
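The three steps can be sketched as composed functions in plain NumPy. This is a toy illustration under stated assumptions, not PyMKS code: every function name is hypothetical, the data are random, the target property is synthetic, and hyperparameter optimization is left out.

```python
from functools import reduce

import numpy as np


def pipeline(*steps):
    """Compose single-argument functions left to right."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


def preprocess(raw):
    """Preprocessing: threshold grayscale samples into phase labels 0 or 1."""
    return (np.asarray(raw) > 0.5).astype(int)


def fingerprint(labels):
    """Feature generation: volume fraction of phase 1, one number per sample."""
    return labels.reshape(len(labels), -1).mean(axis=1, keepdims=True)


def fit_linear(features, targets):
    """Model construction: least-squares fit of targets = slope * x + intercept."""
    design = np.hstack([features, np.ones_like(features)])
    coeffs, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return coeffs


rng = np.random.default_rng(0)
raw = rng.random((20, 8, 8))  # 20 random 8x8 samples (made-up data)

featurize = pipeline(preprocess, fingerprint)
features = featurize(raw)

# Synthetic property that is exactly linear in the feature (made-up numbers).
targets = 1.0 + 2.0 * features[:, 0]
slope, intercept = fit_linear(features, targets)
```

Because the synthetic property is linear in the feature, the fit recovers the slope and intercept used to generate it. Real workflows would use richer fingerprints and a validated model.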

PyMKS consists of tools to compute 2-point statistics, tools for both homogenization and localization linkages, and tools for discretizing the microstructure. In addition, PyMKS has modules for generating synthetic data sets using conventional numerical simulations.
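For readers new to 2-point statistics, the following NumPy sketch computes a periodic autocorrelation of a single phase with the FFT. It is an assumption-laden illustration and not the PyMKS implementation; `two_point_autocorrelation` is a hypothetical name and the microstructure is random.

```python
import numpy as np


def two_point_autocorrelation(indicator):
    """Periodic autocorrelation of a phase indicator field via the FFT.

    Hypothetical helper for illustration only (not a PyMKS function).
    """
    field = np.asarray(indicator, dtype=float)
    spectrum = np.fft.fftn(field)
    # Correlation theorem: the inverse FFT of |F|^2 is the circular autocorrelation.
    counts = np.fft.ifftn(spectrum * np.conj(spectrum)).real
    # Divide by the number of cells so each entry is a probability,
    # then shift the zero vector to the center of the array.
    return np.fft.fftshift(counts / field.size)


rng = np.random.default_rng(0)
phase = (rng.random((16, 16)) < 0.3).astype(float)  # random indicator field

stats = two_point_autocorrelation(phase)

# The zero-vector entry sits at index n // 2 after fftshift.
center = stats[8, 8]
```

The value at the zero vector equals the volume fraction of the phase, which is a convenient sanity check on the normalization.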

To learn about PyMKS, start with the PyMKS examples, especially the introductory example. To learn more about the methods, consult the technical overview.

The two principal objects that PyMKS provides are the TwoPointCorrelation transformer and the LocalizationRegressor, which provide the homogenization and localization functionality, respectively. The objects provided by PyMKS all work as either transformers or regressors in a Scikit-Learn pipeline and accept both Numpy and Dask arrays, enabling out-of-memory, distributed, or parallel computations. The out-of-memory computations are still experimental as of version 0.4, and some issues remain unresolved.

Feedback

Please submit questions and issues on the GitHub issue tracker.

Installation

Conda

To install using Conda,

$ conda install -c conda-forge pymks

To create a development environment, clone this repository and run

$ conda env create -f environment.yml
$ conda activate pymks
$ python setup.py develop

in the base directory.

Pip

Install a minimal version of PyMKS with

$ pip install pymks

This is enough to run the tests, but not the examples. Some optional packages are not available via Pip. To create a development environment, clone this repository and run

$ pip install .

in the base directory.

Nix

Follow the Nix installation guide and then run

$ export NIX_VERSION=21.05
$ export PYMKS_VERSION=0.4.1
$ nix-shell \
    -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/${NIX_VERSION}.tar.gz \
    -I pymks=https://github.com/materialsinnovation/pymks/archive/${PYMKS_VERSION}.tar.gz \
    -E 'with (import <nixpkgs> {}); mkShell { buildInputs = [ (python3Packages.callPackage <pymks> { }) ]; }'

to drop into a shell with PyMKS and all its requirements available. To create a development environment with Nix, clone this repository and run

$ nix-shell

in the base directory.

Docker

PyMKS has a Docker image available via docker.io. Assuming that you have a working version of Docker, use

$ docker pull docker.io/wd15/pymks
$ docker run -i -t -p 8888:8888 wd15/pymks:latest
# jupyter notebook --ip 0.0.0.0 --no-browser

The PyMKS example notebooks are available inside the image after opening the Jupyter notebook from http://127.0.0.1:8888. See DOCKER.md for more details.

Optional Packages

The following packages are optional when using PyMKS.

Sfepy

Sfepy is a Python-based finite element solver. It is useful for generating data that PyMKS can use for machine learning tasks. It is used in quite a few tests, but it is not strictly necessary for using PyMKS. Sfepy is installed automatically when using Nix or Conda, but not when using Pip. See the Sfepy installation instructions to install it in your environment.

GraSPI

GraSPI is a C++ library with a Python interface for creating materials descriptors using graph theory. See the API documentation for more details. Currently, only the Nix installation builds with GraSPI by default. To switch off GraSPI when using Nix, use

$ nix-shell --arg withGraspi false

Testing

To test a PyMKS installation use

$ python -c "import pymks; pymks.test()"

Citing

Please cite the following if you use PyMKS in a publication.

  • Brough, D.B., Wheeler, D. & Kalidindi, S.R. Materials Knowledge Systems in Python—a Data Science Framework for Accelerated Development of Hierarchical Materials. Integr Mater Manuf Innov 6, 36–53 (2017). https://doi.org/10.1007/s40192-017-0089-0