Projecting External Payload Information onto STRING networks

Advanced STRING users can choose to project their own data (on proteins or on interactions between proteins) onto the name space and the networks of STRING. This makes it possible to browse and visualize external data on the STRING website, which helps to place the data into biological context.

In practice, external sites can add multiple kinds of information, independently of each other:

  • Mark proteins of interest with a colored halo (and show an image as legend).
  • Add information to the STRING protein popup windows (by embedding any external site in an HTML iFrame).
  • Define additional connections between proteins (edges), and show edge-related information in the edge popup.

The implementation is done via a simple call-back strategy: users wishing to project payloads must set up a web server on their site (either to provide static files, or to dynamically provide additional information; see below). STRING will then call back to this server in order to get the appropriate information at runtime.

The address of this call-back server must be provided to STRING as an additional parameter on the STRING URL; it will then be remembered for the duration of the session. External data will be kept confidential: no other STRING user will see this data unless they know the additional URL parameter. More importantly, all the iFrame contents are requested directly by the user’s web browser and can thus be password-protected or placed under other restrictions (for example, intranet-only access).

Implementation Details

There are two basic strategies: a static flatfile implementation, and a web service implementation. The first implementation is best suited for smaller amounts of data, which do not change frequently. These data are loaded once, and cached at the STRING site. The second implementation is for larger datasets that may also be subject to more frequent updates. Here, STRING will only request the items that are actually being browsed; these are not cached. Note that in the latter case, user experience depends critically on the stability and speed of the web service for STRING to call. For both strategies, the format of the actual data being passed is identical.

Static Flatfile Implementation

(a) Provide one additional parameter on the URL when calling STRING:

http://string-db.org/newstring_cgi/show_network_section.pl?identifier=9606.ENSP00000327077&external_payload_URL=http://your-server.org/ext.json

Here, STRING downloads all the necessary data the first time a certain external payload is specified and caches the information locally. To force STRING to download the data again, you can add a refresh_payload=1 parameter.
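As a minimal sketch, the URL from step (a) could be assembled in Python as shown here. The function name build_string_url is hypothetical; the base URL and the parameter names are taken from the example above. Note that urlencode percent-encodes the payload address, unlike the unencoded form shown above; that STRING accepts the encoded form is an assumption.

```python
from urllib.parse import urlencode

# Base URL copied from the example above.
STRING_BASE = "http://string-db.org/newstring_cgi/show_network_section.pl"

def build_string_url(identifier, payload_url, refresh=False):
    """Return a STRING network URL that carries an external payload address."""
    params = {"identifier": identifier, "external_payload_URL": payload_url}
    if refresh:
        # Ask STRING to download the payload again instead of using its cache.
        params["refresh_payload"] = "1"
    return STRING_BASE + "?" + urlencode(params)

url = build_string_url(
    "9606.ENSP00000327077", "http://your-server.org/ext.json", refresh=True
)
print(url)
```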

(b) STRING will request the file named by this parameter; it should be a configuration file pointing STRING to further files that it can then download and cache. Its content should be as follows:

{"nodes_file":"http://your-server.org/nodes.txt",
"edges_file":"http://your-server.org/edges.txt",
"logo_file":"http://your-server.org/banner.jpg",
"legend_file":"http://your-server.org/colorlegend.jpg",
"name":"Your_DB_Name"}
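The configuration file can be written by hand, but generating it avoids quoting mistakes. The following is a sketch only: the helper make_payload_config is hypothetical, and the file names simply follow the example above.

```python
import json

def make_payload_config(base_url, name):
    """Build the configuration dictionary for the static flatfile strategy."""
    base = base_url.rstrip("/")
    return {
        "nodes_file": base + "/nodes.txt",
        "edges_file": base + "/edges.txt",
        "logo_file": base + "/banner.jpg",
        "legend_file": base + "/colorlegend.jpg",
        "name": name,
    }

config = make_payload_config("http://your-server.org", "Your_DB_Name")
config_text = json.dumps(config, indent=1)
print(config_text)  # save this text as ext.json on your server
```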

(c) The various files above contain the payload data. Importantly, that data must already be mapped onto the name space of STRING before being used in the callback, precisely matching the STRING version that is being called. This means that identifiers should be in the format 9606.ENSP00000327077.

The data is expected as follows (all fields are tab-separated):

Nodes file:

STRING_id    color_(hex_#FF0000)    comment_URL    URL_for_iFrame_content

Color, comment and URLs are optional. If you do not want to provide one of them, leave the corresponding field empty. The color will be used to paint a "halo" around the node. The other pieces of information are shown on the node popup. The comment will be shown as plain text (no HTML is allowed in this field). If a link URL is provided, a “more info” link will be displayed, linking to the specified URL.
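A line of the nodes file could be produced as sketched below. The sketch assumes that the four columns are identifier, color, comment and URL, in that order; the function node_line and its checks are illustrative and not part of STRING.

```python
import re

# A six-digit hex color such as #FF0000.
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

def node_line(string_id, color="", comment="", url=""):
    """Format one tab-separated nodes-file line; optional fields may be empty."""
    if color and not HEX_COLOR.match(color):
        raise ValueError("color must look like #FF0000")
    for value in (string_id, color, comment, url):
        # A tab or newline inside a field would shift the columns.
        if "\t" in value or "\n" in value:
            raise ValueError("fields must not contain tabs or newlines")
    return "\t".join([string_id, color, comment, url])

# Only a halo color is given; comment and URL stay empty.
print(repr(node_line("9606.ENSP00000327077", "#FF0000")))
```

Empty optional fields still produce their tab separators, so every line keeps the same number of columns.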

Edges file:

STRING_id_1    STRING_id_2    evidence_type    interaction_score    comment_URL    URL_for_iFrame_content

The evidence type can either be one of the existing evidence types (see below) or a new, plain-text evidence type. If a new evidence type is specified, orange edges will be drawn between the nodes. The interaction score is a number between 0 and 1. The comment will be shown as plain text; no HTML is allowed. If a link URL is provided, a show button will be displayed, linking to the specified URL.

Existing evidence channels: neighborhood, fusion, cooccurence, array, experimental, database, textmining.
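The rules for edge lines can be checked before the file is published. In this sketch the function names are hypothetical, the channel names are copied from the list above, and writing the score with three decimals is an arbitrary choice.

```python
# Channel names exactly as listed above (including the spelling "cooccurence").
KNOWN_CHANNELS = {
    "neighborhood", "fusion", "cooccurence", "array",
    "experimental", "database", "textmining",
}

def is_new_evidence_type(evidence_type):
    """True if the type is not an existing channel (drawn as an orange edge)."""
    return evidence_type not in KNOWN_CHANNELS

def edge_line(id_1, id_2, evidence_type, score, comment="", url=""):
    """Format one tab-separated edges-file line after checking the score range."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("interaction score must be between 0 and 1")
    return "\t".join([id_1, id_2, evidence_type, f"{score:.3f}", comment, url])

print(repr(edge_line("9606.A", "9606.B", "experimental", 0.9)))
print(is_new_evidence_type("my_assay"))
```

The identifiers 9606.A and 9606.B are placeholders; real files need identifiers from the STRING name space.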

Web Service Implementation

Here, the same configuration file syntax is used, but STRING should be provided with two web-service addresses instead of the nodes and edges files:

{"nodes_webservice":"http://your-server.org/webservice_cgi/handler_for_node_queries",
"edges_webservice":"http://your-server.org/webservice_cgi/handler_for_edge_queries",
"logo_file":"http://your-server.org/banner.jpg",
"legend_file":"http://your-server.org/legend.jpg",
"name":"Your_DB_Name"}

The web service must respond in exactly the same format as in the static flatfile case. STRING will use two query parameters. First, the identifiers parameter: multiple identifiers are separated with a blank character (URL-encoded as %20). Second, an only_internal parameter may be specified for the edges web service. By default, STRING queries for all edges that emanate from the query protein(s). In contrast, when only_internal=1 is specified, the web service should return only the edges among the query proteins, which reduces the amount of data transferred.

Note that STRING will have to call your web service at least two times per user request (first to find the interaction partners of the input proteins, then to find all internal edges). Therefore, it is crucial that you ensure a rapid response time for a good user experience. STRING does not cache the responses.
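The query handling for the edges web service could be sketched as follows. Everything here is illustrative: the in-memory EDGES table, the placeholder identifiers and the function answer_edge_query are hypothetical, and wiring the function into an actual HTTP server is left out.

```python
from urllib.parse import parse_qs

# Hypothetical edge table: (id_1, id_2, evidence_type, score).
EDGES = [
    ("9606.A", "9606.B", "experimental", 0.9),
    ("9606.A", "9606.C", "my_assay", 0.7),
    ("9606.C", "9606.D", "database", 0.5),
]

def answer_edge_query(query_string, edges=EDGES):
    """Return tab-separated edge lines for one query string."""
    params = parse_qs(query_string)
    # parse_qs decodes %20 to a blank, so split() recovers the identifiers.
    wanted = set(params.get("identifiers", [""])[0].split())
    internal = params.get("only_internal", ["0"])[0] == "1"
    lines = []
    for id_1, id_2, evidence, score in edges:
        hits = (id_1 in wanted) + (id_2 in wanted)
        # Default: any edge touching a query protein.
        # only_internal=1: both endpoints must be query proteins.
        if hits == 2 or (hits == 1 and not internal):
            lines.append("\t".join([id_1, id_2, evidence, f"{score:.3f}", "", ""]))
    return "\n".join(lines)

print(answer_edge_query("identifiers=9606.A%209606.C"))
print(answer_edge_query("identifiers=9606.A%209606.C&only_internal=1"))
```

Because STRING calls the service more than once per user request and does not cache the answers, a real implementation would likely back this lookup with an index or database rather than a linear scan.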