STRING API

STRING has an application programming interface (API) that lets you retrieve data without using the graphical user interface of the web page. The API is convenient when you need programmatic access to some of the information but do not want to download the entire dataset. For example, you might want to query interactions from your own scripts or embed a STRING network in your web page.

We currently provide an HTTP implementation, in which database information is accessed through HTTP requests. For implementation reasons, and as on the website, some API methods allow only a limited number of proteins per query. If you need bulk access, you can download the entire dataset from the download page.

There are several methods available through STRING API:

Method API method URL Description
Mapping identifiers /api/tsv/get_string_ids? Maps common protein names, synonyms and UniProt identifiers into STRING identifiers
Getting the network image /api/image/network? Retrieves the network image with your input protein(s) highlighted in color
Embedding interactive network in the website javascript:getSTRING(...) A call that lets you embed a network with movable bubbles and protein information pop-ups in a website.
Retrieving the interaction network /api/tsv/network? Retrieves the network interactions for your input protein(s) in various text based formats
Getting the interaction partners /api/tsv/interaction_partners? Gets all the STRING interaction partners of your proteins
Getting protein similarity scores /api/tsv/homology? Retrieves the protein similarity scores between the input proteins
Retrieving best similarity hits between species /api/tsv/homology_best? Retrieves the best (highest) protein similarity hit between different species in your input.
Performing functional enrichment /api/tsv/enrichment? Performs the enrichment analysis of your set of proteins for the Gene Ontology, KEGG pathways, UniProt Keywords, PubMed publications, Pfam, InterPro and SMART domains.
Retrieving functional annotation /api/tsv/functional_annotation? Gets the functional annotation (Gene Ontology, UniProt Keywords, PFAM, INTERPRO and SMART domains) of your list of proteins.
Performing interaction enrichment /api/tsv/ppi_enrichment? Tests if your network has more interactions than expected
Getting current STRING version /api/tsv/version Prints the current STRING version and its stable address

Getting started

Because the STRING API works through ordinary HTTP requests, you can access it like any other web page. Just copy and paste the following URL into your browser to get a PNG image of the Patched 1 (PTCH1) gene network.

https://string-db.org/api/image/network?identifiers=PTCH1

Most likely, however, you will be accessing the API from your scripts or website. Example Python scripts for the API calls are attached at the end of the relevant sections.

To query with more than one identifier in a single call, separate the identifiers with the carriage return character, written as "\r" or "%0d" depending on how you call STRING:

https://string-db.org/api/image/network?identifiers=PTCH1%0dSHH%0dGLI1%0dSMO%0dGLI3

or from python3:

import requests ## python -m pip install requests
response = requests.get("https://string-db.org/api/image/network?identifiers=PTCH1%0dSHH%0dGLI1%0dSMO%0dGLI3")
with open('string_network.png', 'wb') as fh:
    fh.write(response.content)
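
The separator handling can be sketched with the standard library alone. The helper name build_image_url is an illustrative assumption, not part of STRING. It relies on urllib.parse.urlencode percent-encoding the carriage return, which yields "%0D" (hex digits in percent-encoding are case-insensitive, so this is equivalent to "%0d").

```python
from urllib.parse import urlencode

STRING_API_URL = "https://string-db.org/api"


def build_image_url(identifiers):
    """Return a GET URL for the network image of the given proteins.

    build_image_url is a hypothetical helper, not part of the STRING API.
    """
    # Join with a carriage return; urlencode turns "\r" into "%0D".
    query = urlencode({"identifiers": "\r".join(identifiers)})
    return STRING_API_URL + "/image/network?" + query


url = build_image_url(["PTCH1", "SHH", "GLI1", "SMO", "GLI3"])
print(url)
```

Building the query string this way avoids typing the escape sequence by hand, which matters once identifiers contain characters that also need encoding.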

...but before you get started:

  1. Please be considerate and wait one second between each call, so that our server won't get overloaded.
  2. Although STRING understands a variety of identifiers and does its best to disambiguate your input it's always better to map them first (see: mapping).
  3. If known, please always specify the species your proteins come from. If you do, each of your queries will be answered faster :)
  4. When calling our API from your website or tools please identify yourself using the caller_identity parameter.
  5. STRING understands both GET and POST requests. GET requests, although simpler to use, have a character limit, therefore it is recommended to use POST whenever possible.
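
Rule 1 above can be sketched as a small throttle that is called before every request. This is only one possible approach; the class name Throttle and its arguments are assumptions made for illustration, and the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive calls (hypothetical helper)."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous call, if any

    def wait(self):
        """Block until at least min_interval seconds have passed since the last call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


throttle = Throttle(min_interval=1.0)
throttle.wait()  # the first call returns immediately
```

In a script, throttle.wait() would be placed directly before each requests.get or requests.post call, so the one-second spacing holds even when the surrounding code runs quickly.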

Mapping identifiers

You can call our STRING API with common gene names, various synonyms or even UniProt identifiers and accession numbers. However, STRING may not always understand them which may lead to errors or inconsistencies. Before using other API methods it is always advantageous to map your identifiers to the ones STRING uses. In addition, STRING will resolve its own identifiers faster, therefore your tool/website will see a speed benefit if you use them. For each input protein STRING places the best matching identifier in the first row, so the first line will usually be the correct one.

Call:

https://string-db.org/api/[output-format]/get_string_ids?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
echo_query insert column with your input identifier (takes values '0' or '1', default is '0')
limit limits the number of matches per query identifier (best matches come first)
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
caller_identity your identifier for us.

Output fields:

Field Description
queryItem (OPTIONAL) your input protein
queryIndex position of the protein in your input (starting from position 0)
stringId STRING identifier
ncbiTaxonId NCBI taxon identifier
taxonName species name
preferredName common protein name
annotation protein annotation

Example call (resolving "p53" and "cdk2" in human):

https://string-db.org/api/tsv/get_string_ids?identifiers=p53%0dcdk2&species=9606

Example of python3 code:

#!/usr/bin/env python3

##########################################################
## For a given list of proteins the script resolves them
## (if possible) to the best matching STRING identifier
## and prints out the mapping on screen in the TSV format
##
## Requires requests module:
## type "python -m pip install requests" in command line
## (win) or terminal (mac/linux) to install the module
###########################################################

import requests ## python -m pip install requests

string_api_url = "https://string-db.org/api"
output_format = "tsv-no-header"
method = "get_string_ids"

##
## Set parameters
##

params = {

    "identifiers" : "\r".join(["p53", "BRCA1", "cdk2", "Q99835"]), # your protein list
    "species" : 9606, # species NCBI identifier 
    "limit" : 1, # only one (best) identifier per input protein
    "echo_query" : 1, # see your input identifiers in the output
    "caller_identity" : "www.awesome_app.org" # your app name

}

##
## Construct URL
##


request_url = "/".join([string_api_url, output_format, method])

##
## Call STRING
##

results = requests.post(request_url, data=params)

##
## Read and parse the results
##

for line in results.text.strip().split("\n"):
    l = line.split("\t")
    input_identifier, string_identifier = l[0], l[2]
    print("Input:", input_identifier, "STRING:", string_identifier, sep="\t")
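
When "limit" is not set to 1, several candidate rows can come back per input protein. Since the best match is listed first, keeping the first row per queryIndex is one way to pick it. The sketch below parses the "tsv" format (with header) by column name; the function name best_matches is an assumption, and the sample rows are invented placeholders in the documented column layout, not real STRING output.

```python
import csv
import io


def best_matches(tsv_text):
    """Map each queryIndex to the first (best) row returned for it."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    best = {}
    for row in reader:
        # Rows for one query arrive best-first, so keep only the first one.
        best.setdefault(int(row["queryIndex"]), row)
    return best


# Invented sample: identifiers, names and annotations are placeholders.
sample = (
    "queryIndex\tstringId\tncbiTaxonId\ttaxonName\tpreferredName\tannotation\n"
    "0\t9606.PLACEHOLDER_A1\t9606\tHomo sapiens\tNAME_A1\tplaceholder\n"
    "0\t9606.PLACEHOLDER_A2\t9606\tHomo sapiens\tNAME_A2\tplaceholder\n"
    "1\t9606.PLACEHOLDER_B1\t9606\tHomo sapiens\tNAME_B1\tplaceholder\n"
)

for index, row in sorted(best_matches(sample).items()):
    print(index, row["stringId"], row["preferredName"], sep="\t")
```

Reading columns by header name rather than by position keeps the parser working when "echo_query" adds the queryItem column.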

Getting STRING network image

With our API you can retrieve an image of a STRING network of a neighborhood surrounding one or more proteins or ask STRING to show only the network of interactions between your input proteins. All the network flavors (confidence/evidence/action) are accessible through the API. The API can output the image as a PNG (low and high resolution with alpha-channel) or as an SVG (vector graphics that can be modified through scripts or in an appropriate software).

Call:

https://string-db.org/api/[output-format]/network?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
image network PNG image with alpha-channel
highres_image high resolution network PNG image with alpha-channel
svg vector graphic format (SVG)

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
add_color_nodes adds color nodes based on scores to the input proteins
add_white_nodes adds white nodes based on scores to the input proteins (added after color nodes)
required_score threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network)
network_flavor the style of edges in the network: evidence, confidence (default), actions
hide_node_labels hides all protein names from the picture (default:0)
hide_disconnected_nodes hides all proteins that are not connected to any other protein in your network (default:0)
block_structure_pics_in_bubbles disables structure pictures inside the bubble (default:0)
caller_identity your identifier for us.

If you query the API with one protein the "add_white_nodes" parameter is automatically set to 10, so you can see the interaction neighborhood of your query protein. However, similarly to the STRING webpage, whenever you query the API with more than one protein we show only the interactions between your input proteins. You can, of course, always extend the interaction neighborhood by setting "add_color/white_nodes" parameter to the desired value.
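
The formats and parameters listed above can be combined into a small request builder that rejects obviously invalid values before anything is sent. This is a sketch under stated assumptions: the function name network_image_request and the mapping from output format to file extension are illustrative choices, not something STRING provides.

```python
from urllib.parse import urlencode

STRING_API_URL = "https://string-db.org/api"
# Assumed file extensions for the three documented image formats.
IMAGE_FORMATS = {"image": "png", "highres_image": "png", "svg": "svg"}
NETWORK_FLAVORS = {"evidence", "confidence", "actions"}


def network_image_request(identifiers, output_format="image", **optional):
    """Return (url, file_extension) for a network image request."""
    if output_format not in IMAGE_FORMATS:
        raise ValueError("unknown image format: %r" % output_format)
    flavor = optional.get("network_flavor")
    if flavor is not None and flavor not in NETWORK_FLAVORS:
        raise ValueError("unknown network flavor: %r" % flavor)
    score = optional.get("required_score")
    if score is not None and not 0 <= score <= 1000:
        raise ValueError("required_score must be between 0 and 1000")
    params = {"identifiers": "\r".join(identifiers)}
    params.update(optional)  # e.g. add_white_nodes, species, caller_identity
    url = "%s/%s/network?%s" % (STRING_API_URL, output_format, urlencode(params))
    return url, IMAGE_FORMATS[output_format]


url, extension = network_image_request(
    ["TP53", "EGFR"], add_white_nodes=10, network_flavor="actions")
print(url)
print(extension)
```

Passing add_white_nodes explicitly, as here, extends the neighborhood even though more than one protein is queried.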

Output (EGFR and TP53 actions neighborhood):

[Figure: EGFR and TP53 interaction neighborhood]

Example call (EGFR and TP53 interaction neighborhood):

https://string-db.org/api/image/network?identifiers=TP53%0dEGFR&add_white_nodes=10&network_flavor=actions

Example python3 code:

#!/usr/bin/env python3

################################################################
## For each protein in a list save the PNG image of
## STRING network of its 15 most confident interaction partners.
##
## Requires requests module:
## type "python -m pip install requests" in command line (win)
## or terminal (mac/linux) to install the module
################################################################

import requests ## python -m pip install requests
from time import sleep

string_api_url = "https://string-db.org/api"
output_format = "image"
method = "network"

my_genes = ["YMR055C", "YFR028C",
            "YNL161W", "YOR373W",
            "YFL009W", "YBR202W"]


##
## Construct URL
##


request_url = "/".join([string_api_url, output_format, method])

## For each gene call STRING

for gene in my_genes:

    ##
    ## Set parameters
    ##

    params = {

        "identifiers" : gene, # your protein
        "species" : 4932, # species NCBI identifier 
        "add_white_nodes": 15, # add 15 white nodes to my protein 
        "network_flavor": "confidence", # show confidence links
        "caller_identity" : "www.awesome_app.org" # your app name

    }


    ##
    ## Call STRING
    ##

    response = requests.post(request_url, data=params)

    ##
    ## Save the network to file
    ##

    file_name = "%s_network.png" % gene
    print("Saving interaction network to %s" % file_name)

    with open(file_name, 'wb') as fh:
        fh.write(response.content)

    sleep(1)

Embedding the interactive network

Using the simple HTML and JavaScript code provided, you can embed the interactive STRING network in your website or web app. The interactive network has several advantages: 1) the user can move the nodes around to explore the network's structure; 2) each node can be clicked, and a pop-up will appear above the network providing information about the protein's structure, function and sequence; 3) the network contains a link-out to STRING that lets the user jump directly to the STRING website and explore the interactions in more detail. The behaviour of the API is similar to that of the Getting the network image API, that is, it lets you retrieve either 1) the image of the interaction neighborhood of one or more proteins or 2) the image of the interaction network only between the specified set of proteins. The two APIs also share the exact same set of parameters.

Code:

First, your website has to load two elements to make it work. IMPORTANT: Do not load these elements in an iframe. If you do, the pop-ups will be confined to the iframe instead of overlaying your website.

The first element is STRING's small JavaScript library, which provides the movable bubbles and pop-ups.

<script type="text/javascript" src="https://string-db.org/javascript/combined_embedded_network_v2.0.2.js"></script>

The second is an HTML DIV element. Insert it in the place where you want the network to be displayed. You can assign any class to this element or style it however you want, just do not change its identifier.

<div id="stringEmbedded"></div>

And finally this API call in JavaScript:

getSTRING('https://string-db.org', {[dictionary of parameters]})

Available parameters:

Parameter Description
identifiers required parameter - array of protein names e.g. ['TP53', 'CDK2']
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
add_color_nodes adds color nodes based on scores to the input proteins
add_white_nodes adds white nodes based on scores to the input proteins (added after color nodes)
required_score threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network)
network_flavor the style of edges in the network: evidence, confidence (default), actions
hide_node_labels hides all protein names from the picture (default:0)
hide_disconnected_nodes hides all proteins that are not connected to any other protein in your network (default:0)
block_structure_pics_in_bubbles disables structure pictures inside the bubble (default:0)
caller_identity your identifier for us.

Example JavaScript call (TP53 interaction neighborhood):

getSTRING('https://string-db.org', {'species':'9606', 'identifiers':['TP53'], 'network_flavor':'confidence'})

Example website code (to test it, copy and paste it into an empty file with the .html extension and open it in your browser):

<!DOCTYPE html>
<html>
  <head>

      <!-- Embed the STRING's javascript -->

      <script type="text/javascript" src="https://string-db.org/javascript/combined_embedded_network_v2.0.2.js"></script>

      <style>

           /* some styling */

           body {
               font-family: Arial;
               color: #122e4d;
               background: #cedded;
           }

           input {
               border: 4px solid #122e4d; 
               width: 40%;
               height: 50px;
               border-radius: 10px;  
               font-size: 20px;
           }

           button {
               font-size:20px;
               border-radius: 10px;  
               border: 1px solid #122e4d; 
           }

      </style>
      <script>

          function send_request_to_string() {

              var inputField = document.getElementById('inputarea');

              var text = inputField.value;

              if (!text) {text = inputField.placeholder}; // placeholder

              var proteins = text.split(' ');

              /* the actual API query */

              getSTRING('https://string-db.org', {
                          'species':'9606',
                          'identifiers':proteins,
                          'network_flavor':'confidence', 
                          'caller_identity': 'www.awesome_app.org'
              })

          }

      </script>
  </head>

  <!-- HTML CODE -->

  <body onload='javascript:send_request_to_string()'>
      <center>
          <h1>THE BEST WEBSITE</h1>
          <h3>Query me: (one human protein or multiple space separated proteins)</h3>
             <input type="text" id='inputarea' placeholder='GLI3'><br/><br/>
             <button onclick='javascript:send_request_to_string();' type="button">Let's go!</button>
             <h3>Network:</h3>
             <div id="stringEmbedded"></div>
      </center>
  </body>
</html>

Getting the STRING network interactions

The network API method also allows you to retrieve your STRING interaction network for one or multiple proteins in various text formats. It will tell you the combined score and all the channel specific scores for the set of proteins. You can also extend the network neighborhood by setting "add_nodes", which will add, to your network, new interaction partners in order of their confidence.

Call:

https://string-db.org/api/[output-format]/network?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format
psi-mi PSI-MI XML format
psi-mi-tab PSI-MITAB format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
required_score threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network)
add_nodes adds a number of proteins to the network based on their confidence score
caller_identity your identifier for us.

If you query the API with one protein, the "add_nodes" parameter is automatically set to 10, so you can get the interaction neighborhood of your query protein. However, similarly to the STRING webpage, whenever you query the API with more than one protein the method will output only the interactions between your input proteins. You can, of course, always extend the interaction neighborhood by setting the "add_nodes" parameter to the desired value.

Output fields (TSV and JSON formats):

Field Description
stringId_A STRING identifier (protein A)
stringId_B STRING identifier (protein B)
preferredName_A common protein name (protein A)
preferredName_B common protein name (protein B)
ncbiTaxonId NCBI taxon identifier
score combined score
nscore gene neighborhood score
fscore gene fusion score
pscore phylogenetic profile score
ascore coexpression score
escore experimental score
dscore database score
tscore textmining score

To see how the combined score is computed from the partial scores, see the FAQ.

Example call (retrieve all interactions between TP53, EGFR and CDK2):

https://string-db.org/api/tsv/network?identifiers=TP53%0dEGFR%0dCDK2&required_score=400

Example python3 code:

#!/usr/bin/env python3

##################################################################
## For the given list of proteins print out only the interactions
## between these proteins that have a medium or higher confidence
## experimental score
##
## Requires requests module:
## type "python -m pip install requests" in command line (win)
## or terminal (mac/linux) to install the module
##################################################################

import requests ## python -m pip install requests


string_api_url = "https://string-db.org/api"
output_format = "tsv-no-header"
method = "network"

##
## Construct URL
##

request_url = "/".join([string_api_url, output_format, method])

##
## Set parameters
##

my_genes = ["CDC42","CDK1","KIF23","PLK1",
            "RAC2","RACGAP1","RHOA","RHOB"]

params = {

    "identifiers" : "\r".join(my_genes), # your proteins ("\r" is URL-encoded by requests)
    "species" : 9606, # species NCBI identifier 
    "caller_identity" : "www.awesome_app.org" # your app name

}

##
## Call STRING
##

response = requests.post(request_url, data=params)

for line in response.text.strip().split("\n"):

    l = line.strip().split("\t")
    p1, p2 = l[2], l[3]

    ## filter the interaction according to experimental score
    experimental_score = float(l[10])
    if experimental_score > 0.4:
        ## print 
        print("\t".join([p1, p2, "experimentally confirmed (prob. %.3f)" % experimental_score]))
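
The script above selects the experimental score by its column position. A variant that requests the "tsv" format (with header) and looks columns up by the documented field names may be easier to maintain. The following is a sketch only: the function name filter_by_channel is an assumption, and the sample rows use invented placeholder identifiers and scores rather than real STRING output.

```python
import csv
import io

FIELDS = ["stringId_A", "stringId_B", "preferredName_A", "preferredName_B",
          "ncbiTaxonId", "score", "nscore", "fscore", "pscore", "ascore",
          "escore", "dscore", "tscore"]


def filter_by_channel(tsv_text, channel="escore", threshold=0.4):
    """Return (name_A, name_B, value) for rows whose channel score exceeds threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    kept = []
    for row in reader:
        value = float(row[channel])
        if value > threshold:
            kept.append((row["preferredName_A"], row["preferredName_B"], value))
    return kept


# Invented sample rows in the documented column order.
rows = [
    ["ID_A", "ID_B", "PROT_A", "PROT_B", "9606",
     "0.9", "0", "0", "0", "0.1", "0.8", "0.5", "0.3"],
    ["ID_A", "ID_C", "PROT_A", "PROT_C", "9606",
     "0.5", "0", "0", "0", "0.2", "0.1", "0", "0.45"],
]
sample = "\n".join("\t".join(r) for r in [FIELDS] + rows)

for name_a, name_b, value in filter_by_channel(sample):
    print(name_a, name_b, "%.3f" % value, sep="\t")
```

Switching the channel argument to, for example, "tscore" filters on the textmining score instead without touching any column indices.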

Getting all the STRING interaction partners of the protein set

Unlike the network API method, which retrieves only the interactions between the set of input proteins and their closest interaction neighborhood (if the add_nodes parameter is specified), the interaction_partners API method provides the interactions between your set of proteins and all the other STRING proteins. The output is available in various text-based formats. As a STRING network usually has a lot of low-scoring interactions, you may want to limit the number of retrieved interactions per protein using the "limit" parameter (the high-scoring interactions come first).

Call:

https://string-db.org/api/[output-format]/interaction_partners?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format
psi-mi PSI-MI XML format
psi-mi-tab PSI-MITAB format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
limit limits the number of interaction partners retrieved per protein (most confident interactions come first)
required_score threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network)
caller_identity your identifier for us.

Output fields (TSV and JSON formats):

Field Description
stringId_A STRING identifier (protein A)
stringId_B STRING identifier (protein B)
preferredName_A common protein name (protein A)
preferredName_B common protein name (protein B)
ncbiTaxonId NCBI taxon identifier
score combined score
nscore gene neighborhood score
fscore gene fusion score
pscore phylogenetic profile score
ascore coexpression score
escore experimental score
dscore database score
tscore textmining score

To see how the combined score is computed from the partial scores, see the FAQ.

Example call (retrieve best 10 STRING interactions for TP53 and CDK2):

https://string-db.org/api/tsv/interaction_partners?identifiers=TP53%0dCDK2&limit=10

Example python3 code:

#!/usr/bin/env python3

################################################################
## For each protein in the given list print the names of
## their 5 best interaction partners.
##
## Requires requests module:
## type "python -m pip install requests" in command line (win)
## or terminal (mac/linux) to install the module
################################################################


import requests ## python -m pip install requests

string_api_url = "https://string-db.org/api"
output_format = "tsv-no-header"
method = "interaction_partners"

my_genes = ["9606.ENSP00000000233", "9606.ENSP00000000412",
            "9606.ENSP00000000442", "9606.ENSP00000001008"]

##
## Construct the request
##

request_url = "/".join([string_api_url, output_format, method])

##
## Set parameters
##

params = {

    "identifiers" : "\r".join(my_genes), # your proteins ("\r" is URL-encoded by requests)
    "species" : 9606, # species NCBI identifier 
    "limit" : 5,
    "caller_identity" : "www.awesome_app.org" # your app name

}


##
## Call STRING
##

response = requests.post(request_url, data=params)

##
## Read and parse the results
##

for line in response.text.strip().split("\n"):

    l = line.strip().split("\t")
    query_ensp = l[0]
    query_name = l[2]
    partner_ensp = l[1]
    partner_name = l[3]
    combined_score = l[5]

    ## print

    print("\t".join([query_ensp, query_name, partner_name, combined_score]))
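
Because one call returns partners for several query proteins at once, it can help to group the rows per query. The sketch below assumes the JSON output is a list of objects keyed by the field names documented above; that structure, the function name top_partners, and the sample values are assumptions made for illustration.

```python
import json
from collections import defaultdict


def top_partners(json_text, n=5):
    """Return {query name: [partner names]} with the n best-scoring partners each."""
    grouped = defaultdict(list)
    for row in json.loads(json_text):
        grouped[row["preferredName_A"]].append(
            (float(row["score"]), row["preferredName_B"]))
    result = {}
    for query, partners in grouped.items():
        # Highest combined score first.
        partners.sort(key=lambda pair: pair[0], reverse=True)
        result[query] = [name for _, name in partners[:n]]
    return result


# Invented sample with placeholder names and scores.
sample = json.dumps([
    {"preferredName_A": "QUERY_1", "preferredName_B": "PARTNER_X", "score": 0.7},
    {"preferredName_A": "QUERY_1", "preferredName_B": "PARTNER_Y", "score": 0.9},
    {"preferredName_A": "QUERY_2", "preferredName_B": "PARTNER_Z", "score": 0.5},
])

print(top_partners(sample, n=1))
```

Sorting locally means the result does not depend on the order in which rows arrive, although the "limit" parameter remains the cheaper way to cap the response size.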

Retrieving similarity scores of the protein set

STRING internally uses Smith–Waterman bit scores as a proxy for protein homology. The original scores are computed by the Similarity Matrix of Proteins (SIMAP) project. Using this API you can retrieve these scores between the proteins in a selected species. The scores are symmetric, so to make the transfer a bit faster we send only half of the similarity matrix (A->B, but not the symmetric B->A) together with the self-hits. The bit score cut-off below which we do not store or report homology is 50.

Call:

https://string-db.org/api/[output-format]/homology?identifiers=[your_identifiers]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
caller_identity your identifier for us.

Output fields (TSV and JSON formats):

Field Description
ncbiTaxonId_A NCBI taxon identifier (protein A)
stringId_A STRING identifier (protein A)
ncbiTaxonId_B NCBI taxon identifier (protein B)
stringId_B STRING identifier (protein B)
bitscore Smith-Waterman alignment bit score

Example call (retrieve homology scores between CDK1 and CDK2):

https://string-db.org/api/tsv/homology?identifiers=CDK1%0dCDK2
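
Since only half of the symmetric matrix is sent, a client that wants to look up a pair in either order has to mirror the entries itself. A minimal sketch, assuming the rows have already been parsed into (stringId_A, stringId_B, bitscore) tuples; the function name and the sample identifiers and scores are placeholders, not real STRING data.

```python
def symmetric_scores(rows):
    """Build a lookup that answers both (A, B) and (B, A) from a half matrix."""
    scores = {}
    for id_a, id_b, bitscore in rows:
        scores[(id_a, id_b)] = bitscore
        scores[(id_b, id_a)] = bitscore  # mirror the entry the API omits
    return scores


# Invented half matrix: self-hits plus one A->B entry.
half_matrix = [
    ("ID_1", "ID_1", 600.0),
    ("ID_1", "ID_2", 300.0),
    ("ID_2", "ID_2", 610.0),
]

scores = symmetric_scores(half_matrix)
print(scores[("ID_2", "ID_1")])        # mirrored entry
print(scores.get(("ID_1", "ID_3")))    # None: this pair was not reported
```

A missing pair only means that no score was reported, for example because it fell below the cut-off, so it should not be read as a score of zero.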

Retrieving best similarity hits between species

STRING internally uses Smith–Waterman bit scores as a proxy for protein homology. The original scores are computed by the Similarity Matrix of Proteins (SIMAP) project. Using this API you can retrieve these similarity scores between your input proteins and the proteins in all of STRING's organisms. Only the best hit per organism for each protein will be retrieved.

There are many organisms in STRING, so expect thousands of hits for each protein. However, you can restrict the results to a list of organisms of interest by using the 'species_b' parameter.

Call:

https://string-db.org/api/[output-format]/homology_best?identifiers=[your_identifiers]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
species_b a list of NCBI taxon identifiers separated by "%0d" (e.g. human, fly and yeast would be "9606%0d7227%0d4932" see: STRING organisms).
caller_identity your identifier for us.

Output fields (TSV and JSON formats):

Field Description
ncbiTaxonId_A NCBI taxon identifier (protein A)
stringId_A STRING identifier (protein A)
ncbiTaxonId_B NCBI taxon identifier (protein B)
stringId_B STRING identifier (protein B)
bitscore Smith-Waterman alignment bit score

Example call (retrieve best homology score between human CDK1 and its closest mouse homolog):

https://string-db.org/api/tsv/homology_best?identifiers=CDK1&species_b=10090
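
The 'species_b' filter takes several taxon identifiers joined by the same separator as the protein identifiers. The sketch below assembles the parameters and shows the resulting query string; the function name homology_best_params is an assumption, and the taxon identifiers are the ones mentioned in this section (mouse 10090, fly 7227, yeast 4932).

```python
from urllib.parse import urlencode


def homology_best_params(identifiers, species=None, species_b=()):
    """Assemble parameters for homology_best (hypothetical helper)."""
    params = {"identifiers": "\r".join(identifiers)}
    if species is not None:
        params["species"] = species
    if species_b:
        # Several target organisms are joined by a carriage return as well.
        params["species_b"] = "\r".join(str(taxon) for taxon in species_b)
    return params


params = homology_best_params(["CDK1"], species=9606,
                              species_b=[10090, 7227, 4932])
query = urlencode(params)
print("https://string-db.org/api/tsv/homology_best?" + query)
```

The same dictionary can be passed as the data argument of requests.post instead of being encoded into a GET URL.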

Getting functional enrichment

STRING maps several databases onto its proteins, including Gene Ontology, KEGG pathways, UniProt Keywords, PubMed publications, Pfam domains, InterPro domains, and SMART domains. The STRING enrichment API method allows you to retrieve functional enrichment for any set of input proteins. It will tell you which of your input proteins have an enriched term and the term's description. The API provides the raw p-values as well as False Discovery Rate and Bonferroni corrected p-values. The detailed description of the enrichment algorithm can be found here.

Call:

https://string-db.org/api/[output_format]/enrichment?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
background_string_identifiers using this parameter you can specify the background proteome of your experiment. Only STRING identifiers will be recognised (each must be separated by "%0d") e.g. '7227.FBpp0077451%0d7227.FBpp0074373'. You can map STRING identifiers using the mapping identifiers method.
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
caller_identity your identifier for us.

Output fields:

Field Description
category term category (e.g. GO Process, KEGG pathways)
term enriched term (GO term, domain or pathway)
number_of_genes number of genes in your input list with the term assigned
number_of_genes_in_background total number of genes in the background proteome with the term assigned
ncbiTaxonId NCBI taxon identifier
inputGenes gene names from your input
preferredNames common protein names (in the same order as your input Genes)
p_value raw p-value
fdr False Discovery Rate
description description of the enriched term

STRING shows only the terms with the raw p-value below 0.1

Example call (enrichment for the trpA, trpB, trpC, trpE and trpGD proteins):

https://string-db.org/api/tsv/enrichment?identifiers=trpA%0dtrpB%0dtrpC%0dtrpE%0dtrpGD

Example python3 code:

#!/usr/bin/env python3

##############################################################
## The following script retrieves and prints out
## significantly enriched (FDR < 1%) GO Processes
## for the given set of proteins. 
##
## Requires requests module:
## type "python -m pip install requests" in command line (win)
## or terminal (mac/linux) to install the module
##############################################################

import requests ## python -m pip install requests 
import json

string_api_url = "https://string-db.org/api"
output_format = "json"
method = "enrichment"


##
## Construct the request
##

request_url = "/".join([string_api_url, output_format, method])

##
## Set parameters
##

my_genes = ['7227.FBpp0074373', '7227.FBpp0077451', '7227.FBpp0077788',
            '7227.FBpp0078993', '7227.FBpp0079060', '7227.FBpp0079448']

params = {

    "identifiers" : "\r".join(my_genes), # your proteins ("\r" is URL-encoded by requests)
    "species" : 7227, # species NCBI identifier 
    "caller_identity" : "www.awesome_app.org" # your app name

}

##
## Call STRING
##

response = requests.post(request_url, data=params)

##
## Read and parse the results
##

data = json.loads(response.text)

for row in data:

    term = row["term"]
    preferred_names = ",".join(row["preferredNames"])
    fdr = float(row["fdr"])
    description = row["description"]
    category = row["category"]

    if category == "Process" and fdr < 0.01:

        ## print significant GO Process annotations

        print("\t".join([term, preferred_names, str(fdr), description]))
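
The script above uses the default background. To supply a custom background proteome, the parameters could be assembled as in the following sketch. The function name enrichment_params is hypothetical, and the regular expression is only an assumed plausibility check based on the "taxon.identifier" shape of the STRING identifiers shown in this document; it does not verify that an identifier exists.

```python
import re

# Assumed shape of a STRING identifier: taxon number, a dot, then a name.
STRING_ID = re.compile(r"^\d+\.\S+$")


def enrichment_params(identifiers, species, background=None):
    """Assemble enrichment parameters, optionally with a background set."""
    params = {"identifiers": "\r".join(identifiers), "species": species}
    if background:
        bad = [item for item in background if not STRING_ID.match(item)]
        if bad:
            raise ValueError("background must use STRING identifiers: %s" % bad)
        params["background_string_identifiers"] = "\r".join(background)
    return params


params = enrichment_params(
    ["7227.FBpp0074373", "7227.FBpp0077451"],
    species=7227,
    background=["7227.FBpp0077451", "7227.FBpp0074373", "7227.FBpp0077788"],
)
print(sorted(params))
```

Checking the background locally gives an early error for gene names that were not mapped first, instead of a silently ignored background.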

Retrieving functional annotation

STRING maps several databases onto its proteins, including Gene Ontology, KEGG pathways, UniProt Keywords, PubMed publications, Pfam domains, InterPro domains, and SMART domains. You can retrieve all these annotations (and not only the enriched subset) for your proteins via this API. Due to the potentially large size of the PubMed (Reference Publications) assignments, they won't be sent by default, but you can turn them back on by specifying the 'allow_pubmed=1' parameter.

Please note: KEGG annotations are not available due to KEGG licence restrictions.

Call:

https://string-db.org/api/[output_format]/functional_annotation?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
allow_pubmed '1' to print also the PubMed annotations, default is '0'
caller_identity your identifier for us.

Output fields:

Field Description
category term category (e.g. GO Process, KEGG pathways)
term annotated term (GO term, domain or pathway)
number_of_genes number of genes in your input list with the term assigned
ratio_in_set ratio of the proteins in your input list with the term assigned
ncbiTaxonId NCBI taxon identifier
inputGenes gene names from your input
preferredNames common protein names (in the same order as inputGenes)
description description of the term

Example call (all functional annotations of CDK1):

https://string-db.org/api/tsv/functional_annotation?identifiers=cdk1
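
The same call can be sketched in Python using only the standard library. This is a minimal sketch, assuming the `tsv` output starts with a header line that uses the field names listed under "Output fields". The helper names (`build_query`, `fetch_annotations`, `parse_annotations`, `terms_by_category`) are illustrative and not part of STRING.

```python
import csv
import io
import urllib.parse
import urllib.request

STRING_API_URL = "https://string-db.org/api"


def build_query(identifiers, species=None, allow_pubmed=False):
    """Encode the documented parameters as a URL query string."""
    # Identifiers are joined with a carriage return, which urlencode
    # writes as %0D, the separator shown in the example URLs.
    params = {"identifiers": "\r".join(identifiers)}
    if species is not None:
        params["species"] = species
    if allow_pubmed:
        params["allow_pubmed"] = 1
    return urllib.parse.urlencode(params)


def fetch_annotations(identifiers, **options):
    """POST the query and return the TSV response as text (network call)."""
    url = STRING_API_URL + "/tsv/functional_annotation"
    body = build_query(identifiers, **options).encode("ascii")
    with urllib.request.urlopen(url, data=body) as response:
        return response.read().decode("utf-8")


def parse_annotations(tsv_text):
    """Read TSV text with a header line into dicts keyed by field name."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


def terms_by_category(rows):
    """Group term identifiers under their category."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["category"], []).append(row["term"])
    return grouped
```

Calling `terms_by_category(parse_annotations(fetch_annotations(["cdk1"])))` would then give one list of terms per category. Keeping the parsing separate from the network call makes it possible to check the parsing on saved responses without contacting the server.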

Getting protein-protein interaction enrichment

Even in the absence of annotated proteins (e.g. in novel genomes) STRING can tell you if your subset of proteins is functionally related, that is, if it is enriched in interactions in comparison to the background proteome-wide interaction distribution. The detailed description of the PPI enrichment method can be found here

Call:

https://string-db.org/api/[output_format]/ppi_enrichment?identifiers=[your_identifiers]&[optional_parameters]

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

Parameter Description
identifiers required parameter for multiple items, e.g. DRD1_HUMAN%0dDRD2_HUMAN
species NCBI taxon identifiers (e.g. Human is 9606, see: STRING organisms).
required_score threshold of significance to include an interaction, a number between 0 and 1000 (default depends on the network)
caller_identity your identifier for us.

Output fields:

Field Description
number_of_nodes number of proteins in your network
number_of_edges number of edges in your network
average_node_degree mean degree of the nodes in your network
local_clustering_coefficient average local clustering coefficient
expected_number_of_edges expected number of edges based on the node degrees
p_value significance of your network having more interactions than expected

Example call (PPI enrichment for a set of five trp proteins):

https://string-db.org/api/tsv/ppi_enrichment?identifiers=trpA%0dtrpB%0dtrpC%0dtrpE%0dtrpGD

Example python3 code:

#!/usr/bin/env python

##############################################################
## The script prints out the p-value of STRING protein-protein
## interaction enrichment method for the given set of proteins 
##
## Requires requests module:
## type "python -m pip install requests" in command line (win)
## or terminal (mac/linux) to install the module
##############################################################

import requests ## python -m pip install requests

string_api_url = "https://string-db.org/api"
output_format = "tsv-no-header"
method = "ppi_enrichment"

##
## Construct the request
##

request_url = "/".join([string_api_url, output_format, method])

##
## Set parameters
##

my_genes = ['7227.FBpp0074373', '7227.FBpp0077451', '7227.FBpp0077788',
            '7227.FBpp0078993', '7227.FBpp0079060', '7227.FBpp0079448']

params = {

    "identifiers" : "%0d".join(my_genes), # your proteins
    "species" : 7227, # species NCBI identifier 
    "caller_identity" : "www.awesome_app.org" # your app name

}

##
## Call STRING
##

response = requests.post(request_url, data=params)

##
## Parse and print the response
##

for line in response.text.strip().split("\n"):
    pvalue = line.split("\t")[5]
    print("P-value:", pvalue)
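
The script above reads the p-value by column position, which depends on the order of the output fields. A hedged alternative is to request the `tsv` format and look the value up by name. The sketch below assumes the header line uses the field names listed under "Output fields"; `parse_ppi_enrichment` is an illustrative helper, not part of STRING.

```python
def parse_ppi_enrichment(tsv_text):
    """Map the header names of a `tsv` response onto the first data row."""
    lines = tsv_text.strip().splitlines()
    header = lines[0].split("\t")   # field names, e.g. "p_value"
    values = lines[1].split("\t")   # the single result row
    return dict(zip(header, values))
```

With `output_format = "tsv"` in the script above, `float(parse_ppi_enrichment(response.text)["p_value"])` would give the p-value as a number, whatever the column order.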

Getting current STRING version

STRING is updated, on average, every two years. You may want to download the latest version of STRING's genomes, or simply be informed when a new version is released. For this you can use the version API, which tells you the latest version of STRING and its stable address. The 'stable address' is an address that does not change between versions, so if you use the stable API address it will always produce the results from that particular version.

Call:

https://string-db.org/api/[output_format]/version

Available output formats:

Format Description
tsv tab separated values, with a header line
tsv-no-header tab separated values, without header line
json JSON format
xml XML format

Available parameters:

None

Output fields:

Field Description
string_version current STRING version
string_stable_address STRING stable URL

Example call:

https://string-db.org/api/json/version
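
To be notified of a new release, a script can compare the reported version with the one it last saw. The sketch below is one way to do that, assuming the JSON response is an object (or a list holding one object) keyed by the output fields above. The exact key of the stable address is an assumption, so the code accepts any key ending in `stable_address`. The function names are illustrative.

```python
import json
import urllib.request

VERSION_URL = "https://string-db.org/api/json/version"


def parse_version(json_text):
    """Return (version, stable_address) from the JSON response text."""
    payload = json.loads(json_text)
    record = payload[0] if isinstance(payload, list) else payload
    version = record.get("string_version")
    # Assumption: the stable address is stored under a key ending in
    # "stable_address" (e.g. "string_stable_address").
    address = next(
        (value for key, value in record.items()
         if key.endswith("stable_address")),
        None,
    )
    return version, address


def fetch_version():
    """Fetch and parse the current version (network call)."""
    with urllib.request.urlopen(VERSION_URL) as response:
        return parse_version(response.read().decode("utf-8"))


def is_new_release(known_version, current_version):
    """True when the server reports a version other than the known one."""
    return current_version != known_version
```

A scheduled job could store the first element returned by `fetch_version()` and use `is_new_release` on the next run to decide whether to download the new data.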

Additional information about the API

If you need to do a large-scale analysis, please download the full data set. Otherwise you may end up flooding the STRING server with API requests. In particular, try to avoid running scripts in parallel :)
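
For moderately sized jobs that still go through the API, one way to stay polite is to send requests one at a time with a pause in between. The following is a minimal sketch of that idea. The batch size of 100 and the one-second pause are assumed defaults rather than documented limits, and `call` stands for whatever function of yours performs a single API request.

```python
import time


def batches(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_sequentially(identifiers, call, batch_size=100, pause_seconds=1.0,
                     sleep=time.sleep):
    """Call `call(batch)` for one batch at a time, pausing between requests."""
    results = []
    for index, batch in enumerate(batches(identifiers, batch_size)):
        if index > 0:
            sleep(pause_seconds)  # wait before every request after the first
        results.append(call(batch))
    return results
```

Because the requests run in a single loop, no two of them are ever in flight at the same time. The `sleep` argument exists only so the pause can be replaced when testing.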

Please contact us if you have any questions regarding the API.

The API for retrieving abstracts and actions is temporarily undocumented but will return soon in a revised form.