PCA with Multiple Domains Per Shape

I’m a bit confused about the different shapeworks functions for PCA and how to use them for shapes with multiple domains.

There seem to be two different sets of shapeworks functions for PCA: The ParticleShapeStatistics class in the main C++ library which has a partial python interface, and the PCAEmbedder class in the pure python Utils library.

The ParticleShapeStatistics class is used in the shapeworks PCA example, and has an option for multiple domains per shape, but I don’t see how to use it to project a PCA score back into a set of points (not quite sure about terminology here). I’m also confused about the pcaLoadings method of this class, and how it relates to PCA scores. The elipsoide_pca example creates a plot where loadings are used in the y axis, but the label is PCA score. Also, the array returned by ParticleShapeStatistics.pcaLoadings is a square array of size (nsamples,nsamples) which is not what I would expect for loadings or scores.

The PCAEmbedder class seems easier to use and has worked when I tested it with the elipsoid mesh data, but I’m not sure how to modify it to handle multiple domains per shape.

Any clarification woudl be appreciated!

Here is my progress: I used the PCAEmbedder class, but had to merge all my domain points into a single array for each subject first, then re-split them before warping.

        project = sw.Project()
        project.load(shapeworks_project_file)
        subjects = project.get_subjects()

        # Get the local points and global transforms
        all_points = []
        subject_transforms = []
        ref_subject_index = None
        for i, subject in enumerate(subjects):
            # Global transform is the last in the list of subject transforms
            subject_global_transform = np.array(subject.get_groomed_transforms()[-1]).reshape(4, 4)
            subject_transforms.append(subject_global_transform)

            local_point_files = subject.get_local_particle_filenames()
            subject_points = []
            for file in local_point_files:
                domain_points = np.loadtxt(project_directory / file)  # File paths stored relative to project directory
                subject_points.append(domain_points)

            all_points.append(subject_points)

            # Find reference subject by comparing against directory we stored earlier
            if fnmatch(Path(subject.get_groomed_filenames()[0]).parents[1], f"*{ref_dir}"):
                ref_subject_index = i

        # Check the number of points in each domain
        domain_n_points = [len(p) for p in all_points[0]]

        # Merge points for each subject and apply transform
        all_points_transformed = []
        for subject_points, transform in zip(all_points, subject_transforms):
            subject_points = pv.PointSet(list(itertools.chain.from_iterable(subject_points)))
            subject_points_transformed = subject_points.transform(transform).points
            all_points_transformed.append(subject_points_transformed)

        all_points_transformed = np.array(all_points_transformed)

        # Load reference meshes and apply transform
        ref_meshes_transformed = []
        for relative_filename in subjects[ref_subject_index].get_groomed_filenames():
            # File paths stored relative to project directory
            ref_mesh_filename = os.path.normpath(os.path.join(project_directory, relative_filename))
            ref_mesh = sw.Mesh(ref_mesh_filename)
            ref_mesh_transformed = ref_mesh.applyTransform(subject_transforms[ref_subject_index])
            ref_meshes_transformed.append(ref_mesh_transformed)

        # Do PCA
        point_embedder = Embedder.PCA_Embbeder(all_points_transformed, num_dim=0, percent_variability=0.99)

        # Test Results: Check that PCA reconstructed reference meshes match original reference mesh
        # --------------------------------------------------------------------------------------------------------------
        # Project PCA for reference shape
        generated_points = point_embedder.project(point_embedder.PCA_scores[ref_subject_index])

        # Break points back into domains
        generated_points_split = np.split(generated_points, np.cumsum(domain_n_points))
        ref_points_split = np.split(all_points_transformed[ref_subject_index], np.cumsum(domain_n_points))

        # Do warping
        warped_meshes = []
        for ref_mesh, ref_points, gen_points in zip(ref_meshes_transformed, ref_points_split, generated_points_split):
            warper = sw.MeshWarper()
            warper.generateWarp(ref_mesh, ref_points)
            warped_mesh = warper.buildMesh(gen_points)
            warped_meshes.append(warped_mesh)

        # Plot comparison
        p = pv.Plotter()
        for warped_mesh, ref_mesh in zip(warped_meshes, ref_meshes_transformed):
            p.add_mesh(sw.sw2vtkMesh(warped_mesh).extract_all_edges(), color="red")
            p.add_mesh(sw.sw2vtkMesh(ref_mesh).extract_all_edges(), color="blue")
        p.show()

Yes, it seems the Python API for mult-domain PCA is lacking. You have the right approach where you concatenate the domains and then split them up.