Curating Identifiers on ChemSpider


ChemSpider is a free chemistry search engine. It has been built to aggregate and index chemical structures and their associated information into a single searchable repository. In order to curate data, upload structures, add associated information, download search results and use our embedding tools, you need to be a registered user. For some tasks you may need the Curator role. This can be requested after you register an account. The process for registering is described in the help page Registering with ChemSpider.

Contents:

  1. What is Curation?
  2. Posting Comments
  3. Curating Identifiers
    1. Editing Overview
    2. Adding Identifiers
      1. Options for Editing Identifiers
    3. Rejecting and Approving Identifiers
      1. Guidelines for Rejecting/Approving Identifiers

What is Curation?

Curation of the ChemSpider database refers to the manual annotation and correction of data, such as structural information, the nomenclature of chemical entities and the links to publications.

There are two main ways to help curate data on ChemSpider:

  • Review data directly – registered users with curation rights can review and remove erroneous data or mark it for master curation.
  • Post comments on a record – a Master Curator will then review these comments and try to resolve any issues raised.

By highlighting and removing incorrect data from the database you help to improve the quality of data that is available to you and the rest of the scientific community.

Posting Comments on ChemSpider

Any user can post comments regarding erroneous data. This could be an incorrect name or a structure that is incorrectly drawn. If you believe that you have found an error in a record, you can submit feedback by clicking on the Leave Feedback button in the top-right corner of the page.

A feedback form is then displayed on top of the record.

Fill in the text box with a description of the error, any reference details (e.g. a publication containing the correct information) and suggestions for how it should be corrected. Please supply an e-mail address so that we can respond. We respect your privacy, and our privacy policy can be found here.

You can also select a Status (Low, Normal, High, Extreme) for the feedback. Finally, you need to complete the CAPTCHA request and select Submit.

Curating Identifiers on ChemSpider

To be able to curate a ChemSpider record you need to have Curator privileges and be logged into your account.

Tip: If you need to request Curator privileges, or are unsure whether you have them, you can check or update your profile by selecting “My Profile” from the “My ChemSpider” menu.

To change any information in the Names and Identifiers section you need to open the editing dialog.

Overview of the Identifiers editing dialog

Open the Names and Identifiers editing dialog by clicking on the Identifier label at the top right of the record.

Alternatively, if the record already has some identifiers, you can click on the Edit button in the Names and Identifiers tab.

The editing dialog displays a list of the identifiers in the record, together with a set of check boxes that allow you to select the identifiers that you want to change. As a Curator you cannot make changes to validated names (indicated in bold face). At the top of the dialog you can see:

  • Selection tools – These allow you to Select All, Deselect All or Invert the Current Selection.
  • Update button – This is used to choose how the state of the selected identifiers should be changed, giving you four options: Reject, Normal, Confirm and Redirect.
  • Add button – This is used to add new identifiers.

Below these there is a key which reminds you of the different styles of text formatting that are used to indicate the current status of an identifier.

In the main body of the dialog the identifiers are grouped by state: Validated identifiers appear at the top of the list, followed by Normal identifiers, with Rejected identifiers at the bottom. Within each group the individual identifiers are listed in alphabetical order.
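The ordering described above can be pictured as a two-level sort: first by state, then by name. The following Python sketch is illustrative only. The `State` and `Identifier` names, the placeholder identifier strings and the case-insensitive comparison are assumptions made for this example and are not part of ChemSpider itself.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    # Lower values sort first, mirroring the top-to-bottom order in the dialog.
    VALIDATED = 0
    NORMAL = 1
    REJECTED = 2


@dataclass(frozen=True)
class Identifier:
    name: str
    state: State


def dialog_order(identifiers):
    """Group by state, then sort alphabetically within each group.

    Case-insensitive comparison is an assumption of this sketch.
    """
    return sorted(identifiers, key=lambda i: (i.state, i.name.casefold()))


# Placeholder identifiers with arbitrarily assigned states.
identifiers = [
    Identifier("zeta synonym", State.NORMAL),
    Identifier("gamma synonym", State.REJECTED),
    Identifier("Alpha name", State.VALIDATED),
    Identifier("beta synonym", State.NORMAL),
]

for item in dialog_order(identifiers):
    print(f"{item.state.name:<9} {item.name}")
```

Encoding the state as an ordered value keeps the grouping and the alphabetical listing in a single sort key.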

At the very bottom of the dialog there are buttons to Save your changes or Cancel them.

Adding an Identifier

To add a new identifier to a record open the Identifiers editing dialog (described in the previous section), and click on the Add button.

The identifier entry box will appear. Type (or paste) the identifier into the Synonym field and select any appropriate check boxes that further define the nature of the identifier (see Guide to the options for adding synonyms below). You can also mark the synonym as approved by selecting the check box at the bottom of the dialog.

Note: Certain characters will not be displayed correctly and should not be used (for instance, Greek characters generated by using the Symbol font face).

When you have finished, click on the Add button and this will return you to the identifiers editing dialog. You should now be able to see the identifier that you added.

To save your changes, click on the Save button and the identifiers editing dialog will close.

A comment box will then pop up so that you can add any comments that would help a Master Curator review the suggested identifiers.

When you have finished, click Ok to return to the main record. An e-mail will be sent to the Master Curators for review and approval.

Guide to the options for adding synonyms

If the name entered is in a foreign language, use the drop-down menu to select the language.

Rejecting and Approving Identifiers

The process of rejecting or approving identifiers involves selecting one or more identifiers and then specifying what their new state should be. A common choice is to reject an identifier because it is incorrectly associated with the structure.

There are four states:

  • Rejected – displays the identifier with a strike-out line to indicate that it does not match the compound
  • Normal – returns a name to a normal state
  • Confirm – confirms that the identifier is appropriately matched with the compound
  • Redirect – lets the user associate the compound with another ChemSpider ID (for example, when there are two tautomers or isomers that the user would like to connect)

Rejecting or approving an identifier is performed in a similar way to adding an identifier:

  1. Open the Identifiers editing dialog.
  2. Select the identifier(s) that you wish to change.
  3. Click on the Update button and this brings up a dialog box which allows you to select the state that you want to apply to the selected identifiers.
  4. Click Ok. This returns you to the Identifiers editing dialog – the altered identifiers will have changed their position in the list of identifiers (approved go to the top of the list, rejected to the bottom) and will be formatted to display their new state.
  5. Click Save. You will be prompted to supply comments that explain your changes to the Master Curator who reviews your curation.

State changes can be applied to a group of identifiers at one time. However, approvals and rejections must be carried out as separate operations. It is not necessary to save the state changes between these operations.
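As a rough mental model of this workflow, the Python sketch below stages one state change per Update operation and applies everything on Save. It is a hypothetical illustration, not ChemSpider code: the `EditingDialog` class, its method names, the state strings and the placeholder identifiers are all assumptions, and the Redirect option is left out for brevity.

```python
from dataclasses import dataclass, field

# States a Curator can apply in this sketch (Redirect is omitted).
CURATOR_STATES = ("Rejected", "Normal", "Confirmed")


@dataclass
class EditingDialog:
    # Maps identifier text to its current state.
    # "Validated" entries are treated as locked for Curators.
    identifiers: dict
    pending: dict = field(default_factory=dict)

    def update(self, selected, new_state):
        """Stage a single new state for every selected identifier."""
        if new_state not in CURATOR_STATES:
            raise ValueError(f"unknown state: {new_state}")
        for name in selected:
            if self.identifiers[name] == "Validated":
                raise PermissionError(f"validated name is locked: {name}")
        for name in selected:
            self.pending[name] = new_state

    def save(self, comment):
        """Apply all staged changes and return a note for the reviewer."""
        changes = dict(self.pending)
        self.identifiers.update(changes)
        self.pending.clear()
        return {"comment": comment, "changes": changes}


dialog = EditingDialog({
    "placeholder name A": "Normal",
    "placeholder name B": "Normal",
    "placeholder name C": "Validated",
})

# Rejection and approval are separate Update operations...
dialog.update(["placeholder name A"], "Rejected")
dialog.update(["placeholder name B"], "Confirmed")

# ...but a single Save covers both.
note = dialog.save("Example comment for the Master Curator")
print(note["changes"])
```

Each call to `update` takes exactly one target state, which is why a rejection and an approval need two calls even though one `save` is enough.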

Guidelines for Removal and Approval of Identifiers

The aim of approving or rejecting identifiers is to make state changes that help the Master Curators speed up the process of database cleansing. Master Curators are responsible for moving curated identifiers to a final state of Confirmed or Deleted based on further research, for reversing the changes, or for leaving the identifiers in their present state.

The intention is to remove associations between structures and identifiers that cause confusion or mislead chemists in their understanding of the chemical structure, and to provide clarification.

The following guidelines address the most common sources of confusion:

  • All systematic names should match the structure as drawn. All stereochemistry in the name must be represented in the structure shown.
  • Any systematic name should be sufficient to convert the name unambiguously to the matching structure.
  • Any Registry numbers must be for the compound as shown. If the compound shown is the neutral base compound, then the registry numbers should not be for the sodium salt or the chloride salt, for example.
  • Identifiers are not meant to be descriptors per se. For example, the description “One of a series of hexamethylcyclohexanes” is not a good identifier. Duplicates can be subtly different but do need to be curated.
