sciencescraper.sciencedirect.scidir_scrape
Functions for retrieving the raw text of ScienceDirect articles.
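Module Contents
Functions
| Function | Description |
| --- | --- |
| get_article_info(api_key, doi=None, pii=None, url=None, chunk_size=None) | Get structured information about a ScienceDirect article (metadata and section text) using the ScienceDirect API. |
| get_full_text(api_key, doi=None, pii=None, url=None, chunk_size=None) | Get the full text of a ScienceDirect article using the ScienceDirect API. |
| get_xml_doi(api_key, doi) | Get the raw XML text from an article using the ScienceDirect API and the article's DOI. |
| get_xml_pii(api_key, pii) | Get the raw XML text from an article using the ScienceDirect API and the article's PII. |
| get_xml_url(api_key, url) | Get the raw XML text from an article using the ScienceDirect API and the article's URL. |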
- sciencescraper.sciencedirect.scidir_scrape.get_article_info(api_key, doi=None, pii=None, url=None, chunk_size=None)
Get structured information about a ScienceDirect article (metadata and section text) using the ScienceDirect API.
- Parameters:
api_key (str) – The API key for the ScienceDirect API. API keys can be obtained by creating an account at https://dev.elsevier.com/.
doi (str, optional) – The DOI of the article to be scraped.
pii (str, optional) – The PII of the article to be scraped.
url (str, optional) – The URL of the article to be scraped.
chunk_size (int, optional) – The size of the chunks to split the full text into. Default is None.
- Returns:
A dictionary containing the title, authors, journal, year, URL, open access status, keywords, abstract, methods, results, discussion, and references of the article.
- Return type:
dict
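The signature accepts three optional identifiers, and the documentation above does not say what happens when none or several are passed. The following is a minimal caller-side sketch, assuming that exactly one identifier should be supplied. `pick_identifier` is a hypothetical helper and is not part of sciencescraper; the PII value is only illustrative.

```python
def pick_identifier(doi=None, pii=None, url=None):
    """Return a one-item dict naming the single identifier that was supplied.

    Hypothetical helper for illustration; not part of sciencescraper.
    """
    supplied = {
        name: value
        for name, value in (("doi", doi), ("pii", pii), ("url", url))
        if value is not None
    }
    if len(supplied) != 1:
        found = ", ".join(sorted(supplied)) or "none"
        raise ValueError(f"expected exactly one of doi, pii, url; got: {found}")
    return supplied


# Illustrative PII value; the result can be unpacked into the documented call:
#   get_article_info(api_key, **identifier_kwargs)
identifier_kwargs = pick_identifier(pii="S0140673610610346")
print(identifier_kwargs)
```

Validating on the caller side keeps an ambiguous request from reaching the API at all, whatever precedence the library itself applies.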
- sciencescraper.sciencedirect.scidir_scrape.get_full_text(api_key, doi=None, pii=None, url=None, chunk_size=None)
Get the full text of a ScienceDirect article using the ScienceDirect API.
- Parameters:
api_key (str) – The API key for the ScienceDirect API. API keys can be obtained by creating an account at https://dev.elsevier.com/.
doi (str, optional) – The DOI of the article to be scraped.
pii (str, optional) – The PII of the article to be scraped.
url (str, optional) – The URL of the article to be scraped.
chunk_size (int, optional) – The size of the chunks to split the full text into. Default is None.
- Returns:
The full text of the article.
- Return type:
str
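The `chunk_size` parameter is described only as "the size of the chunks", without units. The sketch below shows one way such splitting could work, assuming the size is measured in characters. The library may instead count words or tokens, and the return shape when `chunk_size` is set is not documented here. `split_into_chunks` is a hypothetical name, not a sciencescraper function.

```python
def split_into_chunks(text, chunk_size=None):
    """Split text into consecutive pieces of at most chunk_size characters.

    Assumption: chunk_size counts characters; None means "do not split".
    """
    if chunk_size is None:
        return [text]
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


sample = "Introduction. Methods. Results."
chunks = split_into_chunks(sample, chunk_size=12)
for piece in chunks:
    print(repr(piece))
```

Fixed-width slicing can cut a word or sentence in half; a splitter that respects sentence boundaries would be a reasonable refinement if the chunks feed a language model.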
- sciencescraper.sciencedirect.scidir_scrape.get_xml_doi(api_key, doi)
Get the raw XML text from an article using the ScienceDirect API and the article’s DOI.
- Parameters:
api_key (str) – The API key for the ScienceDirect API. API keys can be obtained by creating an account at https://dev.elsevier.com/.
doi (str) – The DOI of the article to be scraped.
- Returns:
The raw XML text of the article.
- Return type:
str
- Raises:
requests.exceptions.HTTPError – If the request to the ScienceDirect API fails.
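To illustrate what a DOI-based request might look like, here is a sketch that builds the URL and headers for Elsevier's Article Retrieval API. That this function uses exactly this endpoint and these headers is an assumption. `build_article_request` is a hypothetical helper and the DOI is a placeholder. Nothing is sent over the network.

```python
from urllib.parse import quote

# Assumed base URL of Elsevier's Article Retrieval API.
BASE_URL = "https://api.elsevier.com/content/article"


def build_article_request(api_key, id_type, identifier):
    """Return (url, headers) for an article lookup by DOI or PII.

    Hypothetical helper; the library's internal request may differ.
    """
    if id_type not in ("doi", "pii"):
        raise ValueError("id_type must be 'doi' or 'pii'")
    # Keep the "/" inside a DOI intact while escaping other unsafe characters.
    encoded = quote(identifier, safe="/")
    url = f"{BASE_URL}/{id_type}/{encoded}"
    headers = {"X-ELS-APIKey": api_key, "Accept": "text/xml"}
    return url, headers


url, headers = build_article_request(
    "YOUR_API_KEY", "doi", "10.1016/j.example.2020.01.001"  # placeholder values
)
print(url)

# Sending it with the requests library would look roughly like:
#   response = requests.get(url, headers=headers)
#   response.raise_for_status()  # raises requests.exceptions.HTTPError on 4xx/5xx
#   xml_text = response.text
```

The `raise_for_status()` call in the comment is the usual source of the `requests.exceptions.HTTPError` listed under Raises, for example when the key is invalid or the article is not covered by the caller's entitlements.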
- sciencescraper.sciencedirect.scidir_scrape.get_xml_pii(api_key, pii)
Get the raw XML text from an article using the ScienceDirect API and the article’s PII.
- Parameters:
api_key (str) – The API key for the ScienceDirect API. API keys can be obtained by creating an account at https://dev.elsevier.com/.
pii (str) – The PII of the article to be scraped.
- Returns:
The raw XML text of the article.
- Return type:
str
- Raises:
requests.exceptions.HTTPError – If the request to the ScienceDirect API fails.
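A PII (Publisher Item Identifier) is often printed with punctuation, such as `S0140-6736(10)61034-6`, while URLs and API paths use the compact 17-character form. Whether `get_xml_pii` accepts the punctuated form is not stated above, so the sketch below normalizes the value before calling it. `normalize_pii` is a hypothetical helper and the PII shown is only illustrative.

```python
import re


def normalize_pii(pii):
    """Strip punctuation from a PII and check its overall shape.

    Assumption: a PII is 17 characters, a leading "S" (serial) or "B" (book)
    followed by 16 alphanumeric characters.
    """
    compact = re.sub(r"[^0-9A-Za-z]", "", pii).upper()
    if not re.fullmatch(r"[SB][0-9A-Z]{16}", compact):
        raise ValueError(f"not a recognizable PII: {pii!r}")
    return compact


compact_pii = normalize_pii("S0140-6736(10)61034-6")
print(compact_pii)
```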
- sciencescraper.sciencedirect.scidir_scrape.get_xml_url(api_key, url)
Get the raw XML text from an article using the ScienceDirect API and the article’s URL.
- Parameters:
api_key (str) – The API key for the ScienceDirect API. API keys can be obtained by creating an account at https://dev.elsevier.com/.
url (str) – The URL of the article to be scraped.
- Returns:
The raw XML text of the article.
- Return type:
str
- Raises:
requests.exceptions.HTTPError – If the request to the ScienceDirect API fails.
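How this function maps a URL to an API request is not documented here. One plausible approach, shown as a sketch only, is to pull the PII out of the article URL path, since ScienceDirect article URLs typically contain a `/pii/<identifier>` segment. `pii_from_url` is a hypothetical helper and the URL is illustrative.

```python
import re
from urllib.parse import urlparse


def pii_from_url(url):
    """Return the PII segment of a ScienceDirect article URL, or None.

    Hypothetical helper; assumes the path contains "/pii/<identifier>".
    """
    path = urlparse(url).path  # query string and fragment are excluded
    match = re.search(r"/pii/([^/]+)", path)
    return match.group(1) if match else None


article_url = (
    "https://www.sciencedirect.com/science/article/pii/S0140673610610346"
    "?via%3Dihub"
)
print(pii_from_url(article_url))
```

Returning `None` for an unrecognized URL lets the caller decide whether to fall back to a DOI lookup or report the problem.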