
img2model

PURPOSE

IMG2MODEL Trains a generative model of protein subcellular location from a collection of microscope images.

SYNOPSIS

function model = img2model( dimensionality, varargin )

DESCRIPTION

IMG2MODEL Trains a generative model of protein subcellular location from a
collection of microscope images.

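Inputs
dimensionality    either '2D' or '3D'
varargin{1}       nuclear (DNA) image directory or wildcard pattern
varargin{2}       cell image directory or wildcard pattern
varargin{3}       protein image directory or wildcard pattern
varargin{4}       param, a structure holding possible parameter options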

See also IMG2SLML
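
The call sequence described above can be sketched as follows. This is a minimal, hedged example rather than an official recipe: the image locations, the resolution values, and the non-diffeomorphic type strings are placeholders assumed for illustration. The field names `train.flag`, `nucleus.type`, `cell.type`, `model.resolution`, `debug`, and `verbose` are the ones read by the source listing on this page.

```matlab
% Hypothetical image locations (assumptions, replace with real paths).
% The change log in the source states that wildcards are supported.
imgdir      = fullfile(pwd, 'images');
dnaPattern  = fullfile(imgdir, 'dna',  '*.tif');
cellPattern = fullfile(imgdir, 'cell', '*.tif');
protPattern = fullfile(imgdir, 'prot', '*.tif');

% Parameter structure read by the 3D branch of img2model.
param = struct();
param.debug      = false;        % per the source, false removes 'temp' if training fails
param.verbose    = true;
param.train.flag = 'framework';  % accepted values: 'nuclear', 'framework', 'all'

% The 3D branch reads these two fields unconditionally. In this file any
% value other than 'diffeomorphic' selects the non-diffeomorphic path; the
% exact strings below are assumptions made for this sketch.
param.nucleus.type = 'cylindrical_surface';
param.cell.type    = 'ratio';

% param.model must exist because the source calls isfield on it.
% The resolution values are placeholders, not measured data.
param.model.resolution = [0.049, 0.049, 0.2];

% Call img2model only if CellOrganizer is on the path.
if exist('img2model', 'file') == 2
    model = img2model('3D', dnaPattern, cellPattern, protPattern, param);
    if isempty(model)
        disp('img2model returned an empty model (no images or training failed).');
    end
else
    disp('img2model is not on the path; skipping the call.');
end
```

For the 2D case the source requires exactly five arguments in total, so the call has the same shape with '2D' as the first argument.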

SUBFUNCTIONS

function [debug] = setDebugParam(param)
function [verbose] = setVerboseParam(param)

SOURCE CODE

0001 function model = img2model( dimensionality, varargin )
0002 %IMG2MODEL Trains a generative model of protein subcellular location from a
0003 %collection of microscope images.
0004 %
0005 %Inputs
0006 %dimensionality    either '2D' or '3D'
0007 %param             a structure holding possible parameter options
0008 %
0009 %See also IMG2SLML
0010 
0011 % Ivan E. Cao-Berg
0012 %
0013 % Copyright (C) 2007-2013 Murphy Lab
0014 % Carnegie Mellon University
0015 %
0016 % ?? ??, 2011 I. Cao-Berg Added 3D model training functionality
0017 % March 23, 2012 I. Cao-Berg Added the creation/deletion of temporary folder
0018 % March 28, 2012 I. Cao-Berg Added a control structure under which if one or more of the
0019 %                image folders are nonexistent or they do not contain images,
0020 %                the method exits
0021 % March 28, 2012 I. Cao-Berg Added verification of input arguments when training a 2D
0022 %                            generative model
0023 % March 28, 2012 I. Cao-Berg Added verification of input arguments when training a 3D
0024 %                            generative model
0025 % April 10, 2012 I. Cao-Berg Added debug flag to the method. If flag is true, temporary
0026 %                            files will not be deleted
0027 % April 11, 2012 I. Cao-Berg Added verbose flag to the method
0028 % April 17, 2012 I. Cao-Berg Returns an empty model when model cannot be trained
0029 % July 5, 2012 I. Cao-Berg Added training flags to method so that users can train whatever component they wish
0030 % July 26, 2012 Y.Yu Fixed a bug in the order of input arguments for the ml_traingenmodel2D method
0031 % August 29, 2012 G. Johnson Modified method call to include parameter structure
0032 % May 7, 2013 I. Cao-Berg Included support of masks when training a model
0033 % only for the 2D case
0034 % May 8, 2013 I. Cao-Berg Removed check for the existence of image folder since the check will happen later in the code
0035 % May 15, 2013 I. Cao-Berg Updated method to support wildcards
0036 %
0037 %%
0038 % June 7-13 2013 D. Sullivan Major refactoring to support parallel/per-cell
0039 %                            parameter calculations for 3D
0040 %%
0041 %
0042 % Jul 22, 2013 G. Johnson    Added parameter to skip preprocessing entirely
0043 %                            and use only currently existing preprocessing
0044 %                            results
0045 % Aug 2, 2013 G. Johnson     Fixed logic so that prot_image_files are not
0046 %                            overwritten by an empty cell array if they exist
0047 % Aug 2, 2013 G. Johnson     Implemented chunk_start parallelization on
0048 %                            per-cell parameterization
0049 % Aug 30, 2013 G. Johnson    Changed the way files are input into the
0050 %                            diffeomorphic model function
0051 %
0052 % This program is free software; you can redistribute it and/or modify
0053 % it under the terms of the GNU General Public License as published
0054 % by the Free Software Foundation; either version 2 of the License,
0055 % or (at your option) any later version.
0056 %
0057 % This program is distributed in the hope that it will be useful, but
0058 % WITHOUT ANY WARRANTY; without even the implied warranty of
0059 % MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
0060 % General Public License for more details.
0061 %
0062 % You should have received a copy of the GNU General Public License
0063 % along with this program; if not, write to the Free Software
0064 % Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
0065 % 02110-1301, USA.
0066 %
0067 % For additional information visit http://murphylab.web.cmu.edu/ or
0068 % send email to murphy@cmu.edu
0069 
0070 % Get collection of images
0071 nuclearModelImages = [];
0072 cellModelImages = [];
0073 proteinModelImages = [];
0074 
0075 model = [];
0076 switch lower(dimensionality)
0077     case '2d'
0078         %icaoberg march 28, 2012
0079         %check number of input arguments. if there are not 5 input
0080         %arguments, then the method returns an empty model
0081         if nargin ~= 5
0082            warning('CellOrganizer: Wrong number of input arguments.');
0083            return;
0084         end
0085 
0086         %check the image directories. if they don't exist or they are
0087         %empty, the method returns an empty model
0088         nuclearModelImagesDirectory = varargin{1};
0089 
0090         cellModelImagesDirectory = varargin{2};
0091  
0092         proteinModelImagesDirectory = varargin{3};
0093     
0094         nuclearModelImages = ml_ls( ...
0095             nuclearModelImagesDirectory );
0096         cellModelImages = ml_ls( ...
0097             cellModelImagesDirectory );
0098         proteinModelImages = ml_ls( ...
0099             proteinModelImagesDirectory );
0100 
0101         
0102         %check that the input parameter is a structure
0103         param = varargin{4};
0104         if ~isa( param, 'struct' )
0105            warning('CellOrganizer: Input argument parameter must be a struct.');
0106            return;
0107         end
0108 
0109         %grj 14/5/2013
0110         debug = setDebugParam(param);
0111         verbose = setVerboseParam(param);
0112         
0113         %icaoberg 7/5/2013
0114         try
0115             masks = ml_ls( ...
0116             param.masks );
0117         catch
0118             masks = [];
0119         end
0120             
0121         
0122         %check existence of temporary folder and make one if it doesn't exist
0123         if ~exist([pwd filesep 'temp' filesep 'preprocessed'], 'dir')
0124           mkdir([pwd filesep 'temp' filesep 'preprocessed']);
0125         end
0126         
0127         % train generative model of protein subcellular location
0128         param.disp = 'false';
0129         model = ml_traingenmodel2D( ...
0130             proteinModelImages,...
0131             nuclearModelImages, ...
0132             cellModelImages, ...
0133             masks, param );
0134     case '3d'
0135         %icaoberg march 28, 2012
0136         %check the existence of the image directory
0137         dnaImagesDirectoryPath = varargin{1};
0138         cellImagesDirectoryPath = varargin{2};
0139         proteinImagesDirectoryPath = varargin{3};
0140         param = varargin{4};
0141         if ~isa( param , 'struct' )
0142             warning('CellOrganizer: Input parameter list must be a structure.');
0143             return
0144         end
0145 
0146         %grj 14/5/2013
0147         debug = setDebugParam(param);
0148         verbose = setVerboseParam(param);
0149         
0150         %mmackie july 3 2012
0151         try
0152             trainFlag = param.train.flag;
0153             if ~isa( trainFlag, 'char' );
0154                 error('CellOrganizer: training flag must be a string');
0155             end
0156         catch
0157             param.train.flag = 'all';
0158             trainFlag = param.train.flag;
0159         end
0160         
0161         %icaoberg july, 5 2012
0162         if ~strcmpi( trainFlag, 'nuclear' ) && ...
0163            ~strcmpi( trainFlag, 'framework' ) && ...
0164            ~strcmpi( trainFlag, 'all' )
0165             error('CellOrganizer: Unrecognized training flag');
0166         end
0167         
0168         %grj 7/26/13 - Check to see if either cell or nucleus is diffeomorphic
0169         % and if so, use the diffeomorphic model
0170         if strcmpi(param.nucleus.type, 'diffeomorphic') || ...
0171                 strcmpi(param.cell.type, 'diffeomorphic')
0172             isdiffeomorphic = true;
0173         else
0174             isdiffeomorphic = false;
0175         end
0176         
0177         %%%%%%%%
0178         %D. Sullivan 6/5/13 refactor the code to produce cell
0179         %parameterizations first and then create models. Can put this into
0180         %separate functions once complete.
0181         
0182         dna_image_files = ml_ls( dnaImagesDirectoryPath );
0183         cell_image_files = ml_ls( cellImagesDirectoryPath );
0184         prot_image_files = ml_ls( proteinImagesDirectoryPath );
0185         
0186         
0187         %icaoberg 7/3/2013
0188         if isempty( dna_image_files )
0189             warning('Could not find any images in the DNA images directory. Using DNA hole finding.' );
0190             dna_image_files = cell(size(cell_image_files));
0191 %             model = [];
0192 %             return
0193         end
0194         
0195         if isempty( cell_image_files )
0196             warning('Could not find any images in the cell images directory. Exiting method.' );
0197             model = [];
0198             return
0199         end
0200 
0201         %grj 8/2/13 fixed logic so that prot_image_files are not
0202         %overwritten by an empty cell array if they exist
0203         if isempty(prot_image_files)
0204             if strcmpi(param.train.flag,'framework')
0205                 prot_image_files = cell(size(cell_image_files));
0206             else
0207                 warning('Could not find any images in the protein images directory. Exiting method.' );
0208                 model = [];
0209                 return
0210             end
0211         end
0212         
0213         %D. Sullivan 6/5/13 get all the masks if they exist
0214         if isfield(param,'masks')
0215             mask_image_files = ml_ls(param.masks);
0216         else
0217             mask_image_files = cell(1,length(dna_image_files));
0218         end
0219         
0220         numimgs = length(cell_image_files);
0221         
0222         %%%%%%
0223         %D. Sullivan June/2013 - Refactoring to per-cell oriented feature
0224         %extraction
0225         %setup param folders
0226         param = set_temp_result_folders(param);
0227         param = ml_initparam(param,struct('downsample',[1,1,1], 'preprocess', true, 'display', false));
0228         
0229         %D. Sullivan 6/7/13-6/13/13
0230         %Do all the cells in parallel
0231         %GRJ 6/17/13
0232             % changed the percellparam_(for/parfor) to work on single
0233             % images to improve maintainability
0234         startmodel = true;
0235         if param.preprocess
0236             if ~isfield(param,'parallel') || param.parallel>=1
0237                 try 
0238                     matlabpool('open', param.parallel)
0239 
0240                     parfor i = 1:numimgs
0241                         percellparam(dna_image_files{i},cell_image_files{i},...
0242                         prot_image_files{i},mask_image_files{i}, i, param)
0243                     end
0244                     parallelflag = true;
0245 
0246                 catch
0247                     disp('Parallel code failed, trying linear method');
0248                     parallelflag = false;
0249                 end
0250             else
0251                 parallelflag = false;
0252             end
0253 
0254             %grj Implementing chunk_start to perform parallel computing
0255             if ~parallelflag
0256                 cellCounter = [];
0257                 c = 1;
0258                 for i = 1:numimgs
0259                     tmpfile = [param.tempparent filesep 'image_lock_' num2str(i)];
0260                     [startimage, ~, ~, tmpfile] = chunk_start(tmpfile);
0261                     
0262                     if startimage
0263                         try
0264                             percellparam(dna_image_files{i},cell_image_files{i},...
0265                                 prot_image_files{i},mask_image_files{i}, i, param)
0266                         catch
0267                             disp(['Skipping image ' num2str(i) ' due to error'])
0268                         end
0269                         chunk_finish(tmpfile)
0270                     else
0271                         disp(['Image ' num2str(i) ' currently being operated on. Skipping.']);
0272                         startmodel = false;
0273                         
0274                         cellCounter(c) = i;
0275                         tmpCounter{c} = tmpfile;
0276                         c = c+1;
0277                     end
0278                     
0279                 end
0280             end
0281         else
0282             disp('Preprocessing flag set to false. Using only currently existing preprocessing results in the temp directory')
0283         end
0284         
0285         if ~startmodel
0286             %check to see if the tmp files finished just in case
0287             if any(cellfun(@(x) exist(x, 'file'), tmpCounter))
0288                 disp('The following images are still processing:')
0289                 for i = 1:length(cellCounter)
0290                     disp([num2str(cellCounter(i)) ': ' tmpCounter{i}])
0291                 end
0292                 
0293                 %if the model is diffeomorphic, we can still build
0294                 %intermediate parts with the data that currently exists
0295                 if ~isdiffeomorphic
0296                     model = [];
0297                     return;
0298                 end
0299 
0300                 
0301             end
0302         end
0303         
0304         %D. Sullivan 6/17/13
0305         %With the percell features computed, visualize the distributions of
0306         %some interesting parameters
0307         if isfield(param,'percellreport') && ...
0308             (param.percellreport == true || param.percellreport == 1)
0309             
0310             model2report_percell(param);
0311         end
0312 
0313         %Now load the relevant files and create single cell arrays for each
0314         %compartment
0315  
0316         %D. Sullivan 6/12/13 all this is now taken care of by set_temp_result_folders.m
0317         %check existence of temporary folder and make one if it doesn't exist
0318 %         if ~exist( [ pwd filesep 'temp'], 'dir' )
0319 %             mkdir( [ pwd filesep 'temp'] );
0320 %         end
0321         
0322         try
0323             %icaoberg april 17, 2012
0324             model.dimensionality = '3D';
0325             
0326             %gj jul 23, 2013 add diffeomorphic model
0327             if isdiffeomorphic
0328                 
0329                 diff_model = train_diffeomorphic_model(param, numimgs);
0330                 if strcmpi( trainFlag, 'all' ) || strcmpi(trainFlag, 'framework')
0331                     model.nuclearShapeModel = diff_model;
0332                     model.cellShapeModel = diff_model;
0333                 elseif strcmpi(trainFlag, 'nuclear')
0334                     model.nuclearShapeModel = diff_model;
0335                 elseif strcmpi(trainFlag, 'cell') %unreachable: 'cell' is rejected by the training flag check above
0336                     model.cellShapeModel = diff_model;
0337                 end
0338             
0339             else
0340                 if verbose
0341                    %clc;
0342                    fprintf( 1, '%s\n', 'Training nuclear shape model' );
0343                 end
0344 
0345                 %D. Sullivan 6/12/13 refactored to use per-cell parameters
0346                 %Nuclear model
0347                 if ~exist([param.tempparent filesep 'nuc_model.mat'],'file')
0348                     model.nuclearShapeModel = train_nuc_shape_model( param.nuctemppath,...
0349                         param.tempparent,param );
0350                 else
0351                     load([param.tempparent filesep 'nuc_model.mat']);
0352                     model.nuclearShapeModel = nuclearShapeModel;
0353                 end
0354                 %gj aug 29, 2012
0355                 %passes in 'param' now
0356     %             model.nuclearShapeModel = train_nuc_shape_model( ...
0357     %                 dnaImagesDirectoryPath, ...
0358     %                 cellImagesDirectoryPath, ...
0359     %                 proteinImagesDirectoryPath, ...
0360     %                 param );
0361                 %D. Sullivan 6/12/13 refactored to use per-cell features
0362                 %Cell model
0363                 if strcmpi(param.train.flag,'all')||strcmpi(param.train.flag,'framework')
0364                     fprintf( 1, '%s\n', 'Training cell shape model' );
0365                     if ~exist([param.tempparent filesep 'cell_shape_model.mat'],'file')
0366                         model.cellShapeModel = train_cell_shape_model3(param.celltemppath,...
0367                            param.tempparent);
0368                     else
0369                         load([param.tempparent filesep 'cell_shape_model.mat']);
0370                         model.cellShapeModel = cellShapeModel;
0371                     end
0372                 end
0373                 %mmackie july 3, 2012
0374     %             if strcmpi( trainFlag, 'framework' ) || strcmpi( trainFlag, 'all' )
0375     %                 fprintf( 1, '%s\n', 'Training cell shape model' );
0376     %                 model.cellShapeModel = train_cell_shape_model2( ...
0377     %                     dnaImagesDirectoryPath, ...
0378     %                     cellImagesDirectoryPath, ...
0379     %                     proteinImagesDirectoryPath, ...
0380     %                     param );
0381     %             end
0382 
0383                 if strcmpi( trainFlag, 'all' )
0384                     if verbose
0385                       %clc;
0386                       fprintf( 1, '%s\n', 'Training protein model' );
0387                     end
0388                     %D. Sullivan 6/12/13 refactored to use percell features
0389                     %Note: param contains all temp path info already from
0390                     %set_temp_result_folders.m
0391                     %Prot model
0392                     if ~exist([param.tempparent filesep 'protmodel.mat'],'file')
0393                         model.proteinShape = train_protein_model2( param );
0394                     else
0395                         load([param.tempparent filesep 'protmodel.mat']);
0396                         model.proteinShape = proteinShape;
0397                     end
0398     %                 %D. Sullivan 2/22/13 added param structure to pass resolution
0399     %                 model.proteinShape = train_protein_model( ...
0400     %                     dnaImagesDirectoryPath, ...
0401     %                     cellImagesDirectoryPath, ...
0402     %                     proteinImagesDirectoryPath, ...
0403     %                     param );
0404                 end
0405 
0406             end
0407             %grj 7/9/13 check for fields so the model won't crash
0408             
0409            %icaoberg 22/02/2013
0410             if isfield(param.model, 'original_resolution')
0411                 model.info.original_resolution = param.model.original_resolution;
0412             else
0413                 model.info.original_resolution = 'n/a';
0414             end
0415             
0416             if isfield(param.model, 'downsampling')
0417                 model.info.downsampling_vector = param.model.downsampling;
0418             else
0419                 model.info.downsampling_vector = 1;
0420             end
0421            %D. Sullivan 6/12/13 removed. already set and misspelled
0422 %            model.nuclearShapeModel = struct('resoluton', param.model.resolution);
0423             if isfield(param.model, 'resolution')
0424                 model.cellShapeModel.resolution = param.model.resolution;
0425             else
0426                 model.cellShapeModel.resolution = 'n/a';
0427             end
0428            %D. Sullivan 2/24/2013 This should be set already in
0429            %train_protein_model
0430            %model.proteinShape.resolution = param.model.protein_resolution;
0431 
0432         catch err
0433            %icaoberg april 17, 2012
0434            %returns empty model if model cannot be trained
0435                      
0436            model = [];
0437            warning('CellOrganizer: Unable to train 3D generative model.');
0438           
0439            %icaoberg 06/02/2013
0440            if debug
0441                getReport( err, 'extended' )
0442            end
0443            
0444            if ~debug
0445               rmdir( 'temp', 's' );
0446            end
0447            
0448            return
0449         end
0450     otherwise
0451         warning(['Unknown dimensionality ' ...
0452             dimensionality '. Exiting method.' ]);
0453         model = [];
0454         return
0455 end
0456 
0457 end%img2model
0458 
0459 function [debug] = setDebugParam(param)
0460 %set debug default to true
0461     try
0462         debug = param.debug;
0463         if ischar(debug)
0464             if strcmpi(debug, 'false')
0465                debug = false;
0466             else 
0467                debug = true;
0468             end
0469         else
0470             debug = logical(debug);
0471         end
0472     catch
0473         debug = true;
0474     end 
0475 end
0476 
0477 function [verbose] = setVerboseParam(param)
0478 %set verbose default to true
0479     try
0480         verbose = param.verbose;
0481         if ischar(verbose)
0482             if strcmpi(verbose, 'false')
0483                verbose = false;
0484             else 
0485                verbose = true;
0486             end
0487         else
0488             verbose = logical(verbose);
0489         end
0490     catch
0491         verbose = true;
0492     end 
0493 end
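
The two subfunctions above, setDebugParam and setVerboseParam, apply the same rule to different fields: a missing field defaults to true, the string 'false' (any case) maps to false, any other string maps to true, and anything else is passed through logical(). A minimal sketch of that shared rule, written as a loop over field names, might look like the following. The names `flagNames` and `flags` and the extra field `quiet` are hypothetical and exist only for this illustration. Unlike the original try/catch, this sketch does not fall back to the default when logical() fails on an unconvertible value.

```matlab
% Example parameter structure: one string flag, one numeric flag,
% and one flag ('quiet', hypothetical) that is intentionally missing.
param = struct('debug', 'false', 'verbose', 1);

flagNames = {'debug', 'verbose', 'quiet'};
flags = struct();

for k = 1:numel(flagNames)
    name = flagNames{k};
    if ~isfield(param, name)
        % missing field: same default as the original subfunctions
        flags.(name) = true;
    elseif ischar(param.(name))
        % any string other than 'false' (case-insensitive) counts as true
        flags.(name) = ~strcmpi(param.(name), 'false');
    else
        % numeric or logical input
        flags.(name) = logical(param.(name));
    end
end

disp(flags);
```

Folding both subfunctions into one rule keeps the default and the string handling in a single place, so a future change to the convention (for example, accepting 'no' as false) would only need to be made once.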

Generated on Sun 29-Sep-2013 18:44:06 by m2html © 2005