The Norm Program
The Norm lexical program generates normalized strings for terms included in the SPECIALIST Lexicon. Normalization strips possessives, removes stop words such as "Not Otherwise Specified" (NOS), lower-cases each word, replaces punctuation with spaces, breaks the string into its constituent words, reduces each word to its uninflected base form, and sorts the words in alphabetical order.
Below is an example of the normalization process for the term "Hodgkin's diseases, NOS".
Process | Outcome |
---|---|
Remove genitive | Hodgkin diseases, NOS |
Remove stop words | Hodgkin diseases, |
Lowercase | hodgkin diseases, |
Strip punctuation | hodgkin diseases |
Uninflect | hodgkin disease |
Sort words | disease hodgkin |
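The steps in the table can be sketched in a few lines of Python. This is only an illustration of the sequence, not the Norm program itself: the stop-word set and the `BASE_FORMS` lookup are small hypothetical stand-ins (the real program uninflects words using the SPECIALIST Lexicon), and the function names are invented for this example.

```python
import re

# Illustrative stop-word set (assumption); Norm ships its own list.
STOP_WORDS = {"nos", "of", "and", "with", "for", "to", "in", "by", "on", "the"}

# Toy uninflection table (assumption) standing in for a lexicon lookup.
BASE_FORMS = {"diseases": "disease"}

_STOP_RE = re.compile(
    r"\b(?:" + "|".join(sorted(STOP_WORDS)) + r")\b", re.IGNORECASE
)


def normalize_steps(term):
    """Return (step name, outcome) pairs in the same order as the table."""
    steps = []

    # Remove genitive: "Hodgkin's" -> "Hodgkin"
    text = re.sub(r"(\w)'s\b", r"\1", term)
    steps.append(("Remove genitive", text))

    # Remove stop words as whole words, then tidy the whitespace
    text = " ".join(_STOP_RE.sub(" ", text).split())
    steps.append(("Remove stop words", text))

    text = text.lower()
    steps.append(("Lowercase", text))

    # Strip punctuation by replacing it with spaces
    text = " ".join(re.sub(r"[^\w\s]", " ", text).split())
    steps.append(("Strip punctuation", text))

    # Uninflect each word, falling back to the word itself
    words = [BASE_FORMS.get(word, word) for word in text.split()]
    steps.append(("Uninflect", " ".join(words)))

    steps.append(("Sort words", " ".join(sorted(words))))
    return steps


def normalize(term):
    """Return only the final normalized string."""
    return normalize_steps(term)[-1][1]


for step, outcome in normalize_steps("Hodgkin's diseases, NOS"):
    print(f"{step}: {outcome}")
```

Run on the example term, the sketch prints the same intermediate outcomes as the table and ends with `disease hodgkin`.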
The Norm program is used in systems to:
- Find similar terms
- Map terms to UMLS concepts
- Find lexical variants for a term
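One way such systems can use normalized strings is as lookup keys: terms that normalize to the same string are treated as variants of one another. The sketch below assumes a simplified `norm_key` helper with a toy stop-word set and uninflection table (all hypothetical names, not part of the Lexical Tools) and groups a handful of sample terms by their key.

```python
import re
from collections import defaultdict

# Toy resources (assumptions); a real system would call Norm instead.
STOP_WORDS = {"nos"}
BASE_FORMS = {"diseases": "disease"}


def norm_key(term):
    """Simplified normalization used only to build lookup keys."""
    text = re.sub(r"(\w)'s\b", r"\1", term).lower()
    words = re.sub(r"[^\w\s]", " ", text).split()
    words = [BASE_FORMS.get(w, w) for w in words if w not in STOP_WORDS]
    return " ".join(sorted(words))


def build_index(terms):
    """Map each normalized string to the original terms that produce it."""
    index = defaultdict(list)
    for term in terms:
        index[norm_key(term)].append(term)
    return index


terms = [
    "Hodgkin's diseases, NOS",
    "Hodgkin Disease",
    "Disease, Hodgkin's",
    "Hodgkin lymphoma",
]
index = build_index(terms)

# Look up variants of a query by normalizing it the same way.
variants = index[norm_key("hodgkin's disease")]
print(variants)
```

Here the first three sample terms share the key `disease hodgkin`, so they are returned together, while "Hodgkin lymphoma" falls under a different key. Mapping a term to a concept would follow the same pattern, with the index values holding concept identifiers instead of strings.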
Download Lexical Tools, including the Norm program
Last Reviewed: July 29, 2016