Package schrodinger :: Package application :: Package matsci :: Module jobutils :: Class StringCleaner

Class StringCleaner

object --+
         |
        StringCleaner

Manages the cleaning of strings.

Instance Methods
 
__init__(self, extra_replacement_pairs=None, separator='-')
Populate an instance with some defaults.
cleanAndUniquify(self, input_str, clear_prev=False, max_len=100)
Returns str. Shorten the input string if necessary, replace certain characters in it, and then uniquify it by comparing it with a dictionary of previous names and the number of times each has been used.

Inherited from object: __delattr__, __format__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __sizeof__, __str__, __subclasshook__

Properties

Inherited from object: __class__

Method Details

__init__(self, extra_replacement_pairs=None, separator='-')
(Constructor)

 

Populate an instance with some defaults. Replacements are order-dependent, so the replacement dictionary must be set up so that the most specific replacements occur last. For example, ('C:\', '') should be applied before (':', '') and ('\', ''). Because people tend to append to an iterable rather than prepend, the iterable is traversed backwards, so the pairs added last are applied first.

Parameters:
  • extra_replacement_pairs (list of tuples) - each tuple is a single replacement pair: the substring to be replaced, followed by the substring that replaces it.
  • separator (str) - for non-unique strings, the string that separates the non-unique part from the number of times used, which is the unique part.
Overrides: object.__init__
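
The ordering rule above can be illustrated with a minimal sketch. It is not the StringCleaner implementation: the helper name apply_replacements and the sample pairs are hypothetical, and the only behavior taken from the description is that the pairs are traversed from last to first.

```python
def apply_replacements(text, replacement_pairs, backwards=True):
    """Apply (old, new) pairs to text.

    Hypothetical helper for illustration only. With backwards=True the
    list is traversed from last to first, as the description states.
    """
    pairs = reversed(replacement_pairs) if backwards else replacement_pairs
    for old, new in pairs:
        text = text.replace(old, new)
    return text


# General pairs come first; the more specific pair is appended last.
pairs = [(':', ''), ('\\', ''), ('C:\\', '')]
path = 'C:\\jobs\\run:1'

# Backwards traversal removes the whole 'C:\' prefix before ':' and '\'.
cleaned = apply_replacements(path, pairs)

# Forward traversal strips ':' and '\' first, so 'C:\' never matches
# and a stray 'C' is left behind.
forward = apply_replacements(path, pairs, backwards=False)
```

Here cleaned is 'jobsrun1', while forward is 'Cjobsrun1', which is why the more specific pair has to be applied before the general ones.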

cleanAndUniquify(self, input_str, clear_prev=False, max_len=100)

 

Shorten the input string if necessary, replace certain characters in it, and then uniquify it by comparing it with a dictionary of previous names and the number of times each has been used.

Parameters:
  • input_str (str) - the input string to be cleaned and uniquified
  • clear_prev (bool) - whether the dictionary of previous names should be cleared first
  • max_len (int) - maximum allowed length of input_str; a longer string is shortened to max_len
Returns: str
the input string, cleaned and uniquified
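
The following is a rough sketch of the shorten, replace, uniquify sequence described above, written as a hypothetical stand-in class rather than the real StringCleaner. The default replacement pairs, shortening by simple truncation, and the numbering scheme (first use returned unchanged, later uses suffixed with the separator and a count) are all assumptions made for illustration; the actual class may differ on each point.

```python
class SketchStringCleaner:
    """Hypothetical stand-in; not the schrodinger implementation."""

    def __init__(self, extra_replacement_pairs=None, separator='-'):
        # Assumed defaults: the real default pairs are not documented here.
        self.replacement_pairs = [(' ', '_'), ('/', '_')]
        if extra_replacement_pairs:
            self.replacement_pairs.extend(extra_replacement_pairs)
        self.separator = separator
        # Maps each cleaned name to the number of times it has been used.
        self.prev_names = {}

    def cleanAndUniquify(self, input_str, clear_prev=False, max_len=100):
        if clear_prev:
            self.prev_names.clear()
        # Assumption: shortening is plain truncation to max_len.
        cleaned = input_str[:max_len]
        # Traverse the pairs backwards so the last-added pairs apply first.
        for old, new in reversed(self.replacement_pairs):
            cleaned = cleaned.replace(old, new)
        count = self.prev_names.get(cleaned, 0) + 1
        self.prev_names[cleaned] = count
        # Assumption: the first use is returned without a suffix.
        if count == 1:
            return cleaned
        return f'{cleaned}{self.separator}{count}'


cleaner = SketchStringCleaner()
first = cleaner.cleanAndUniquify('my job')
second = cleaner.cleanAndUniquify('my job')
reset = cleaner.cleanAndUniquify('my job', clear_prev=True)
short = cleaner.cleanAndUniquify('abcdefgh', max_len=4)
```

Under these assumptions, first is 'my_job', second is 'my_job-2', reset is 'my_job' again because the dictionary of previous names was cleared, and short is 'abcd'.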