User Guide

Creating a Plot

Selecting the Data to plot

There are two ways to create plots within Sentira. One way is to select a plot type from the buttons on the right-hand toolbar and then choose which properties to show on each axis. Alternatively, select the property columns in the data set that you would like to plot, and a plot showing those columns will be created; the plot type is chosen for you based on the number and type of columns selected. If you would like to see a different plot for these properties, simply change the plot type using the plot buttons on the right-hand toolbar.

If you have opened or created more than one data set, you can also include the data from those other data sets in the plot you have created. All the data sets will be listed in the key below the plot. Those which contain the properties that are displayed in the plot will be enabled, and simply checking the box next to the name of the data set you would like to include will add it to the plot. The up and down arrows next to the key control the order in which points from the different data sets are layered within the plot. You can control whether a data set is on top by selecting its name in the key and clicking the up or down arrows to change its position.

Changing Plot Types

Sentira contains a number of different plot types that you might wish to create:

Histogram (1D or 2D)

Pie Chart

Scatter Plot (2D or 3D)

SAR Table

Radar Plot

Box Plot

ROC Curve

Each plot type is represented by its own button on the right-hand toolbar making it easy to choose which you would like to create.

Interacting with Your Plot

You can click on different parts of a plot to see which compounds are represented. Whenever you select one or more points in a plot, these are also highlighted in the data set. In a scatter plot you can click on individual points or draw a region to select all the points within. For a histogram, box plot or pie chart you can click on a section to see which compounds it represents. You can add to the current selection by holding down the Ctrl key and selecting further compounds.

Creating Multiple Plots

If you would like to see more than one plot you can click the detach button to “detach” a copy of the visualisation area. You can do this as many times as you wish and can continue to change plot types or the data displayed within them. While the main visualisation area will change plot types if you select different columns (as described above) detached plots will not, ensuring that any plot you have detached will remain unchanged until you modify it or the data sets themselves.

All plots, including those that have been detached, remain linked to your data. Whenever you select rows in a data set, the corresponding points or sections in any plots you have created will be highlighted. If you select a point or section in a plot then the corresponding rows in the data set(s) will be highlighted, along with the corresponding points or sections in any other plots.

Customising Your Plot

Changing Colours and Sizes

You can choose the colour and size for each data set displayed in your plot and, if desired, use these to represent other properties. Click on the colour or size in the data set key below the plot to use the Format by Property dialogue. You can configure the colours on the Colour tab and the sizes on the Size tab. On either tab, if you choose a property from the list then you can specify how to colour/size the different values represented by the property. If the property is numerical then you can either interpolate between two extreme values or you can specify bins. A numerical property can also be used with a log scale. When interpolating you can choose the ends of the range over which to interpolate. When binning you can edit the number of bins and the range of values that each one represents. For categorical data you can choose a colour/size for each category. When colouring by a property, grey is used to represent missing data.
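The two numerical colouring modes described above can be illustrated with a small Python sketch. This is not Sentira's implementation: the function names, the RGB tuples, the particular grey used for missing data and the treatment of values exactly on a bin boundary are all assumptions made for illustration.

```python
import bisect
import math

MISSING_GREY = (128, 128, 128)  # assumed shade; the guide only says "grey"


def interpolate_colour(value, low, high, low_rgb, high_rgb, log_scale=False):
    """Blend between two RGB colours according to where value sits in [low, high]."""
    if value is None:
        return MISSING_GREY
    if log_scale:
        # A log scale only makes sense for strictly positive numbers.
        value, low, high = math.log10(value), math.log10(low), math.log10(high)
    fraction = (value - low) / (high - low)
    fraction = max(0.0, min(1.0, fraction))  # clamp values outside the range
    return tuple(round(a + fraction * (b - a)) for a, b in zip(low_rgb, high_rgb))


def bin_colour(value, edges, colours):
    """Pick the colour of the bin that contains value.

    edges are the ascending boundaries between bins, so one more colour
    than edge is needed. A value equal to an edge falls in the upper bin
    (an assumption, not documented behaviour).
    """
    if value is None:
        return MISSING_GREY
    return colours[bisect.bisect_right(edges, value)]
```

For example, interpolate_colour(10, 1, 100, (0, 0, 200), (200, 0, 0), log_scale=True) lands halfway between the two colours because 10 is the midpoint of 1 and 100 on a log scale, whereas bin_colour(4.2, [3, 5], ["blue", "yellow", "red"]) selects the middle bin.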

For plots which represent groups of compounds e.g. histogram or pie chart, if you use interpolative colour/size for a property then the average property value of all the compounds represented by that section will be used to set the colour/size. However, if you bin the data then the section will be coloured/sized as a set of appropriately sized subsections, one for each bin that contains data. Note: sizing by a property has no effect on histograms, box plots, radar plots or ROC curves.

Trellising

As well as using colour and size to add dimensions to your plots, you can also trellis them. This enables you to split the plot area into a series of smaller plots based upon property criteria that you choose. Trellised plots share the same axes ranges making them easy to compare. To trellis a plot, select a property from the list. Click the trellis button to choose the number of trellises and to customise the ranges represented by each one.

Interactive Filtering

If you wish to remove data points from a plot, click on the Filters tab (below the plot) and then add a filter for each property you’d like to use as a filter. For numerical properties you can use sliders to filter the data. The sliders represent the range of values that are displayed and as you move them data points outside the range will be filtered out. For categorical properties you will see a check box for each possible category which you can uncheck to filter those categories.
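As a rough illustration of what the sliders and check boxes do, the following Python sketch keeps a row only if every numerical value lies inside its slider range and every categorical value is still checked. The function name and data layout are hypothetical, and how Sentira treats missing values is not covered here.

```python
def passes_filters(row, ranges, allowed):
    """Return True if row survives all numerical and categorical filters.

    ranges maps a column name to its (low, high) slider positions;
    allowed maps a column name to the set of categories left checked.
    """
    for column, (low, high) in ranges.items():
        if not (low <= row[column] <= high):
            return False  # outside the slider range
    for column, categories in allowed.items():
        if row[column] not in categories:
            return False  # category has been unchecked
    return True


rows = [
    {"id": "c1", "logP": 2.1, "series": "A"},
    {"id": "c2", "logP": 4.8, "series": "A"},
    {"id": "c3", "logP": 2.9, "series": "B"},
]
visible = [r["id"] for r in rows
           if passes_filters(r, {"logP": (1.0, 3.0)}, {"series": {"A"}})]
```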

Changing Axes, Titles, Tick-mark Labels and Background

You can modify the axis ranges in a number of ways. Using the mouse-wheel you can zoom into or out of a plot, which you can also do using the + and – keys. The cursor buttons can be used to pan a plot in any direction. Using the mouse, you can grab a tick mark and drag it to zoom into or out of one axis. If you right-click on the plot you can choose Edit axes... from the menu. This will display a dialogue in which there is one tab for each axis on the plot. For all plots except the SAR table (see below) you can set the maximum and minimum values for each axis. For appropriate plots you can also specify the tick marks to be displayed along with the intercept. For some axes you can also indicate that the axis should be displayed using a logarithmic scale. For a radar plot you can indicate that an axis should be inverted. For categorical axes you can choose which categories are displayed and a Hide Empty button helps you to automatically remove any categories which have no values.

You can edit axis labels and the title by simply clicking on them and editing the text. If you would like to change the fonts used for these (as well as other text items in the plot), right-click on the plot and choose Change Fonts/Colours... from the menu.

If you wish to change the background colour of the plot, right-click on the plot and choose Change Background...

Further Customisations

There are a number of ways to customise the plot area which are available on the Customise tab below the plot. The following options are available for the different plot types:

Histogram:

  • Display a grid in the background
  • If the histogram is 1D then you can choose whether additional data sets are stacked or shown side by side
  • If the histogram is 2D you can show error bars on the Y axis

Pie chart:

  • When viewing multiple pie charts in a trellis or SAR table you can choose whether the individual pies are all a constant size or sized relative to the number of compounds represented by each pie

Scatter plots:

  • Display a grid in the background
  • Display error bars for any of the properties
  • Display only compounds that are selected
  • Jitter the data points
  • If the scatter plot is 2D and both properties are numerical then you can choose to display an identity line or a regression line
  • If the scatter plot is 3D then you can choose whether the axes origin is in the corner, in the centre, or whether to hide the axes altogether

SAR table:

  • Choose whether to see pie charts, histograms or radar plots within each cell
  • When displaying histograms choose which property to visualise
  • When displaying radar plots choose which properties to include

Radar plot:

  • Display only compounds that are selected

Box plot:

  • Display a grid in the background
  • Choose the percentile to be represented by the whiskers

ROC Curve:

  • Display a grid in the background

Annotating Your Plot

Annotating Plots with Compound Structures and Data

In plots which show points representing individual compounds, such as a scatter plot (or even the outliers in a box plot) you can add individual annotations to points. If you right-click on a point you can choose to Add annotation... and then select one or more properties to display next to the point. You can also choose to display the chemical structure. The annotation is drawn with a visible connection to the point so that you can move it to an appropriate space within the plot.

In plots which display summary information about groups of compounds, such as histograms, box plots or SAR tables you can add summary information about properties in the form of annotations. To annotate an individual bar in a histogram or box plot, right-click on the bar and choose Add annotation... You can then select one or more properties in the dialogue that appears.

You can choose annotations for all items in these plots by clicking the Annotations... button at the bottom of the Customise tab.

Text Labels

You can also add text labels anywhere within plots. If you right-click on a point in a scatter or box plot, or a bar within a histogram or box plot, and choose Add label then a label will be attached to that item. Right-clicking and choosing Add label anywhere else will enable you to add an unattached label. To edit the label, simply click on it and change the text. You can move any label by dragging it.

Using Your Plots in Reports and Presentations

Copying Plots into Documents and Presentations

If you would like to add the plots you create into presentations or documents then you can simply copy them by right-clicking and choosing Copy Image from the menu. Alternatively, you can save the image as a file to use later by choosing Save Image... from the menu instead.

Saving Plot Templates

You can save plot templates by clicking the save button and choosing a name for the template. A plot template stores the type of plot, properties, axis labels, title and any general annotations (i.e. those that have been set by clicking the Annotations... button on the Customise tab). To load a plot template click on the load button and select the template file.

Analysing Structure Activity Relationships

R-group Decomposition Tool

You can analyse the SAR of your compounds and link structural and property variations in your chemical series. Sentira’s R-group decomposition tool enables you to analyse a chemical series to visualise the impact of variations to R-groups, linkers, atoms or fragments on compound properties.

To start the R-group decomposition wizard, select an example molecule from the data set and click R-Group Decomposition... from the SAR menu. If no molecule is selected, the first one in the data set will be used.

Hold down the left mouse button and draw round the scaffold structure to select it. If there is more than one scaffold region, hold the Shift or Ctrl key down while selecting the second region. Having done this, click the Next button to confirm the R positions. The selected parts of the molecule will remain visible, but the other regions will be displayed as R-groups. If multiple regions were selected then the region joining them will be displayed as a linker.

To specify additional R-group positions select the R button and click at the positions on the molecule where these may occur. To change the name of an R-group, select it and type in the new name.

To specify variable atoms, select the X button and then click the atom positions which may vary within the data set.

To remove an R-group or variable atom, select the eraser button and then click on the R-group or variable atom you wish to remove.

If you choose to Use Strict Matching then hydrogen atoms will be included in the scaffold definition. You can use strict matching to avoid matching scaffolds that have additional R-groups. Click the Next button to give the analysis a name and then click Finish.

Once the wizard has finished, the R-group decomposition results will be added to the data set. Each R-group position, variable atom or linker will be given its own column in the data set. When calculating the R-groups, where there is a choice over R-group column assignment, the following properties are used, in sequence, to determine the order of R-group assignment: aromatic bond count, polar atom count and molecular weight.

If you would like to see the scaffold structure at any time, hover the mouse over one of the R-groups. Alternatively, right-click on the header of one of the R-group columns and select menu option View Scaffold. You could also select View Scaffold from the Tools, R-Groups menu. This will create a new window which shows the scaffold. This window can be docked in different places around the sides of the main window. If there are multiple R-group analyses in your data set then the displayed scaffold will match whichever R-group analysis column that you have selected.

Visualising SAR

You can visualise the SAR by selecting the SAR table button and then choosing the R-group columns to use for the axes. Selecting two R-group columns and then colouring the plot based upon a property enables you to explore relationships between structural characteristics and a property.

Matched Molecular Pairs analysis

After completing an R-group decomposition you can run a Matched Molecular Pairs (MMP) analysis to identify pairs of molecules that have only a single point of variation. To do this simply right-click on an R-group column and select Matched Pairs Analysis... Alternatively, select Matched Pairs Analysis... from the Tools, R-Groups menu. The Matched Pairs Analysis dialogue confirms the R-group decomposition on which the matched pairs analysis is to be based. Running the analysis will add new columns to the data set. One MMP column is added for each R-group position.

In this simple example, the ‘MMPAnalysis1 R1’ column identifies matched molecular pairs that differ only at the R1 position and the ‘MMPAnalysis1 R2’ column identifies matched molecular pairs that differ only at the R2 position. Molecules that have matched pairs are assigned an ‘MMP group’. Any two molecules that are in the same MMP group (for that column) are a matched pair and have all other substituents in common.
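The idea of an MMP group can be sketched in a few lines of Python: molecules are bucketed by their substituents at every position except the one being varied, and a bucket becomes a group only if it contains at least two different substituents at the varied position. This is an illustrative assumption about the grouping logic, not Sentira's actual algorithm; the function name, the dictionary layout and the group numbering are hypothetical.

```python
from collections import defaultdict


def mmp_groups(rows, r_columns, vary):
    """Map row index -> group number for matched pairs that differ only at vary."""
    buckets = defaultdict(list)
    for index, row in enumerate(rows):
        # Key on every R-group column except the one allowed to vary.
        key = tuple(row[column] for column in r_columns if column != vary)
        buckets[key].append(index)

    groups = {}
    next_group = 1
    for members in buckets.values():
        # A group needs at least two different substituents at the varied position.
        if len({rows[i][vary] for i in members}) >= 2:
            for i in members:
                groups[i] = next_group
            next_group += 1
    return groups


molecules = [
    {"R1": "H", "R2": "Me"},
    {"R1": "Cl", "R2": "Me"},
    {"R1": "H", "R2": "Et"},
    {"R1": "F", "R2": "OMe"},
    {"R1": "Cl", "R2": "Et"},
]
r1_groups = mmp_groups(molecules, ["R1", "R2"], "R1")
```

Here the first two molecules share R2 = Me and form one group, the third and fifth share R2 = Et and form another, and the fourth molecule has no partner so it receives no group.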

You may wish to sort the data set using an MMP column. You may also wish to use the MMP columns to create plots. One suggestion is to create a scatter plot of a property of interest, such as pKi, against an R-group position such as R1 to see how particular substituent changes can affect a property. You can then colour the plot by the MMP analysis column for R1 to highlight trends that may exist between matched molecular pairs.

In this plot of pKi versus R1 the plot has been coloured by the MMP analysis results column for position R1. We can see that, on average, the change H → Cl may increase pKi. Colouring the plot in this way allows us to focus on cases where the change H → Cl is the only change made, because we can identify trends by looking at the differences in pKi values between points of the same colour. The plot shows that for every case where H was replaced by Cl at R1, and all other substituents were constant, the effect was to increase the value of pKi.

An alternative plot that you may find useful is to plot the property of interest against an MMP column, as shown to the left. This is coloured by the substituent at the corresponding position.

In this case, a substituent that corresponds to a consistently high, or low, property value for multiple MMP groups may indicate a strong influence of that substituent on the property. The colours of the points may help to identify such a group. However with many different R-groups, this can be difficult to see. In this case, you may find it useful to use a second plot of substituents at R1 to select combinations of R-groups and use the option to plot only selected data (available in the Customise options). This allows smaller numbers of R-groups to be compared interactively.

Note: Because the MMP analysis results relate to pairs of molecules, adding molecules to, or removing molecules from, a data set can invalidate the MMP results. If you add or remove molecules you should update the results by running the analysis again.

Managing Your Data

Changing the View of Your Data Set

There are two different views that can be used for browsing the information in a data set. The table view shows all the information in a table where each row displays a different compound and each column displays a different data type. The molecule view shows one compound at a time to make it easier to see all the properties for a single molecule. On the tool-bar below the data set are two buttons that can be used to switch between these views. The table button switches to table view and the molecule button switches to molecule view.

In molecule view you can scroll the molecules left and right using the buttons or clicking either side of the molecules. The data below will change to display values for the current molecule. You can drag and drop the data items in the grid below to arrange them as desired and clicking on a molecule will select/deselect it.

Organise your data

You can change the order in which properties are displayed by clicking the organise button to bring up the Organise Data dialogue. Drag and drop properties (or highlight them and click the up and down buttons) to change the order in which properties are displayed. Any unchecked properties will be hidden.

Freezing columns and rows

If you wish to keep individual columns visible in your data set as you scroll through the table, simply select the column(s) of interest and click the freeze column button. The chosen columns will be duplicated and displayed to the left of your data set so that they remain visible as you scroll left and right. The equivalent is also possible for rows: there is a separate freeze row button, so rows and columns can be frozen simultaneously.

Sorting and Searching

If you would like to sort a data set based on a particular property, simply click the right mouse button on the column title to display the menu and select Sort followed by either Ascending or Descending. Alternatively, you can sort the data based on their standard deviation or probability by clicking the right mouse button on the column title to display the menu and selecting Sort by confidence followed by either Ascending or Descending.

If you would like to sort the data set based on multiple properties then click the sort button. In the Sort Data dialogue select the first column you wish to sort on and then specify whether you wish to sort by the values or by the confidence (standard deviation or probability depending on the data type) and whether you wish to sort in ascending or descending order. Click the Add button to sort by a second column. Select a row and click the Delete button if you wish to remove a column from the list. You can start again by clicking the Clear button. To sort the data click OK.
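A multi-column sort of this kind can be reproduced outside Sentira with a stable sort applied once per column, starting with the lowest-priority column. The Python sketch below shows the idea; the function name and the example columns are hypothetical, and sorting by confidence is left out.

```python
def sort_rows(rows, sort_keys):
    """Sort rows by several columns.

    sort_keys is a list of (column, descending) pairs in priority order.
    Python's sort is stable, so sorting by the last key first and the
    first key last gives the usual multi-level ordering.
    """
    result = list(rows)
    for column, descending in reversed(sort_keys):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


compounds = [
    {"id": "a", "series": "B", "pKi": 6.1},
    {"id": "b", "series": "A", "pKi": 7.2},
    {"id": "c", "series": "A", "pKi": 8.0},
    {"id": "d", "series": "B", "pKi": 6.9},
]
# Series ascending, then pKi descending within each series.
ordered = sort_rows(compounds, [("series", False), ("pKi", True)])
ordered_ids = [row["id"] for row in ordered]
```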

You can also sort the data set based upon structural similarity to a given compound by right clicking on the header of the row containing your reference compound and choosing Sort by structural similarity (in either Ascending or Descending order).

To find data within a specific selection (or the whole data set if there are no selections) select the Find option on the Data Set menu to display the Find dialogue. Alternatively use the short-cut Ctrl-f.

Merging Data Sets

You can merge two data sets together, creating a new data set with columns from both the current data set and those of a selected set. To do this open both the data sets and, with one of the data sets being the active window (if necessary bring it to the front using the Windows menu), select Merge... from the Data Set menu to open the Merge dialogue.

In the Merge dialogue, select the other data set to be merged from the drop-down list (assuming the other data set is not already showing). The table will show the columns from the current data set under Column and the equivalent column from the other data set, with which each will be paired, under Merge With. By default, columns with the same name and type will be paired, but the drop-down lists give you the opportunity to review and make changes. To pair any other columns select the appropriate column name from the drop-down list and then click the OK button.

Rows from the two data sets will be considered matches, and will therefore be merged, when all the paired columns have identical values. For each match a single row will be created in the output data set. Where values are missing in some of the rows, these will not be considered when matching and in the final data set the missing values will be replaced where possible.
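The matching rule above can be made concrete with a short Python sketch. It treats None as a missing value, ignores missing values when comparing paired columns, and fills gaps in the merged row from whichever side has a value. The function names and data layout are hypothetical, and details such as which column name survives in the output are assumptions rather than documented behaviour.

```python
def rows_match(left, right, paired):
    """True if every paired column agrees, ignoring missing (None) values."""
    for left_col, right_col in paired:
        a, b = left.get(left_col), right.get(right_col)
        if a is None or b is None:
            continue  # missing values are not considered when matching
        if a != b:
            return False
    return True


def merge_rows(left, right, paired):
    """Combine two matching rows, filling missing values where possible."""
    merged = dict(left)
    left_name = {right_col: left_col for left_col, right_col in paired}
    for right_col, value in right.items():
        # Paired columns keep the current data set's column name (assumed).
        target = left_name.get(right_col, right_col)
        if merged.get(target) is None:
            merged[target] = value
    return merged


current = {"ID": "c1", "logP": None, "pKi": 7.0}
other = {"Name": "c1", "logP": 2.5, "MW": 300.0}
pairs = [("ID", "Name"), ("logP", "logP")]
combined = merge_rows(current, other, pairs) if rows_match(current, other, pairs) else None
```

In this example the rows match on the identifier, the missing logP is filled from the other data set, and the unpaired MW column is carried across.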

Calculating New Data Columns

You can use the function editor to create new columns which are functions calculated using the values in other columns. You can open the function editor using the Data Set, Function Editor menu.

Clicking on a function in the list displays a template in the editor with the first item that needs to be filled in already selected and ready for input. To add column names, choose them from the list or type them in. Note: Column names should be enclosed in curly brackets e.g. {logP}. If you wish to compare a value in a column with some text then the text should be enclosed within double quotes.

If you wish to generate categories in the results then the desired category names should be enclosed within single quotes.

Example:
To categorise logP into high and low, simply use the function:
if({logP}>3,'high','low')
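For readers who want to check the result of such a function outside Sentira, for example on an exported CSV file, the same logic can be written in Python as below. The function name is hypothetical, and returning None for a missing logP is an assumption; this guide does not say how the function editor treats missing values.

```python
def categorise_logp(logp, threshold=3.0):
    """Return 'high' when logP is above the threshold, otherwise 'low'."""
    if logp is None:
        return None  # assumed handling of missing data
    return "high" if logp > threshold else "low"


categories = [categorise_logp(value) for value in [3.5, 3.0, 1.2, None]]
```

Note that a value exactly equal to 3 is categorised as low, because the comparison is strictly greater than.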

Clicking the OK button will close the dialogue and add a new column to the data set which has the name you specified in New Column Name. If there is an error in the function, a message will appear indicating that there is a problem.

Note: If values in columns which are used as part of the function are changed after the function column has been created, the values in the function column will update automatically.

Propagation of Error

When functions are applied, standard deviations and probabilities are calculated for the results based upon those in the input data. Please be aware that error estimates for non-linear functions are therefore only approximations, the accuracies of which are partially dependent on the mathematical function and partially on the scale of the numbers being used.
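This guide does not state how Sentira computes these estimates. One widely used technique, shown here purely as an illustration, is first-order propagation: the standard deviation of f(x) is approximated by the absolute value of the derivative of f at x multiplied by the standard deviation of x. The Python sketch below estimates the derivative numerically; the function name and step size are assumptions.

```python
import math


def propagate_sd(func, x, sd, step=1e-6):
    """First-order estimate of the standard deviation of func(x).

    Uses a central-difference derivative, so the result is exact for
    linear functions and only approximate for non-linear ones.
    """
    derivative = (func(x + step) - func(x - step)) / (2 * step)
    return abs(derivative) * sd


# Linear function: the estimate is exact (2 * 0.3).
linear_sd = propagate_sd(lambda x: 2 * x + 1, 5.0, 0.3)

# Non-linear function: converting a log value back to a linear scale.
# The estimate depends strongly on x and assumes the error is small
# compared with the curvature of the function.
nonlinear_sd = propagate_sd(lambda x: 10 ** x, 2.0, 0.5)
```

The second case shows why such estimates should be treated with care: a standard deviation of 0.5 log units around 2 spans roughly 32 to 316 on the linear scale, which is far from symmetric, yet a first-order estimate reports a single symmetric value.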

Tagging Selections

To remember any selected rows click the tag button. This will display a dialogue enabling you to create a new column which contains flags in each cell indicating whether or not that row was selected.

Editing data

To edit one or more rows of data, select them and then right-click on the row label to bring up the menu. To delete, cut or copy the selected rows select the appropriate menu item. To paste data from the clipboard select the Paste menu item. Cut, copy and paste are also available on the Edit menu and via the shortcuts Ctrl-x (cut), Ctrl-c (copy) and Ctrl-v (paste).

To edit one or more columns right-click on the column header to display the menu. To delete the selected columns select the Delete menu item. To insert a new column select the Insert... menu item and then enter a name for the new column in the dialogue. To edit the properties and data of a column select the Edit... menu item. The Edit Column dialogue will be displayed. This dialogue enables you to change different characteristics of the column, including the type (number, molecule, category, date or text).

If you want to edit individual values, double-click the cell and change the values that appear in the dialogue (Note: this is not possible for molecules). If you click OK then the dialogue will change the value and close. If instead you press the Enter key then the change will be made and the dialogue will then select the next row and show you those values so that you can edit them too, enabling you to continue editing values down a column.

Checking for duplicates

You can check and remove rows in your data set that have duplicate information by selecting Check For Duplicates... from the Data Set menu. Firstly, select the columns to be compared. If you select more than one column then the corresponding values in all of the columns must be identical for two rows to be considered duplicates. Once the data set has been analysed, the different sets of duplicate rows will be shown, enabling you to confirm which rows should be kept and removed. The final step of the process enables you to choose whether the duplicate rows should be removed from the actual data set or whether a new data set should be created which just contains the duplicate-free rows.
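The duplicate rule, that all selected columns must be identical, can be sketched in Python as follows. The function names are hypothetical, and keeping the first row of each duplicate set is only one possible choice; in Sentira you confirm which rows to keep.

```python
from collections import defaultdict


def find_duplicate_sets(rows, columns):
    """Return lists of row indices whose values agree in every selected column."""
    seen = defaultdict(list)
    for index, row in enumerate(rows):
        seen[tuple(row[column] for column in columns)].append(index)
    return [indices for indices in seen.values() if len(indices) > 1]


def without_duplicates(rows, columns):
    """Build a new list that keeps only the first row of each duplicate set."""
    to_drop = {i for indices in find_duplicate_sets(rows, columns) for i in indices[1:]}
    return [row for index, row in enumerate(rows) if index not in to_drop]


data = [
    {"smiles": "CCO", "batch": 1},
    {"smiles": "CCN", "batch": 1},
    {"smiles": "CCO", "batch": 2},
    {"smiles": "CCO", "batch": 1},
]
by_structure = find_duplicate_sets(data, ["smiles"])
by_structure_and_batch = find_duplicate_sets(data, ["smiles", "batch"])
```

Comparing on the structure alone finds three duplicates, while adding the batch column narrows this to the two rows that agree on both.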

Filtering data

To filter out data set rows that satisfy one or more criteria, select Filter... from the Data Set menu. The dialogue that appears enables you to define as many filters as you need. Each of the filters contains the name of the column to check, whether to filter by value or confidence, and the actual condition to be met, which will depend on the column type. The option to filter invalid values is also available. The rows that match the combination of filters can either be deleted from the data set or a new data set can be created which just contains the rows that have not been filtered.

Saving & Exporting

After working with a data set you may wish to save your results for future analysis. A number of options are available.

To save the data set as a Sentira file choose Save from the File menu. If the data set has been saved before, this will update the saved version. If the data set has not been saved before, the Save As dialogue will appear. Specify the file name where prompted and click the Save button. The file will be saved with the suffix .skd to indicate that it is a Sentira file. To save the data set as a separate file, select Save As from the File menu. The same dialogue will appear giving you the option to choose a new file name for the data set.

To export an SD File, select Save As... from the File menu, select SD Files from the Save as type drop-down list and specify the file name. The file will be saved with the suffix .sdf to denote an SD file.

To export a CSV File, select Save As... from the File menu, select Comma-Separated Variable Files from the Save as type drop-down list and specify the file name. The file will be saved with the suffix .csv to denote a CSV file.

If you just wish to save the chemical structures you can also export your data set as a SMILES file. However, unless you only need to export the structures and an identifier, exporting a data set in this format is not recommended because no other information will be exported. To export a SMILES file, select Save As... from the File menu, select SMILES Files from the Save as type drop-down list and specify the file name. The file will be saved with the suffix .smi to denote a SMILES file.

Note: If the column immediately to the right of the structure column contains text it will be used as the identifier for the SMILES string.
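A SMILES file conventionally holds one structure per line, optionally followed by whitespace and an identifier. The Python sketch below shows what such lines might look like for a small data set. It assumes a single space as the separator and hypothetical column names; the exact layout Sentira writes is not specified in this guide.

```python
def smiles_lines(rows, structure_col, id_col=None):
    """Build the lines of a SMILES file from a list of row dictionaries."""
    lines = []
    for row in rows:
        identifier = row.get(id_col) if id_col else None
        if identifier:
            lines.append(f"{row[structure_col]} {identifier}")
        else:
            lines.append(row[structure_col])  # structure only, no identifier
    return lines


export = smiles_lines(
    [
        {"Structure": "CCO", "Name": "ethanol", "logP": -0.3},
        {"Structure": "c1ccccc1", "Name": "", "logP": 2.1},
    ],
    "Structure",
    "Name",
)
```

The logP values do not appear in the output, which illustrates why this format is only suitable when structures and an identifier are all you need.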
