Your Question
While reviewing the metadata collection workflow, I noticed that multi-word keywords appear to be inserted into the Scopus query URL without being encoded. I have not been able to check whether this causes incorrect behavior in the Scopus API, because I do not have a Scopus API key yet.
Is this the expected behavior, or should multi-word keywords be URL-encoded or quoted before being passed to Scopus?
What I've Already Tried
- Checked the documentation: Yes
- Searched existing issues: Yes
- Installed ComProScanner locally and traced the metadata collection workflow.
- Followed the query path:
main_fetch() → _fetch_paginated_data() → fetch_and_process_data() → _construct_url()
- Generated a test URL using the keyword:
thermal conductivity
Generated URL:
https://api.elsevier.com/content/search/scopus?query=PUBYEAR+%3D+2024+thermal conductivity&count=200&cursor=*
The space between "thermal" and "conductivity" is preserved as a literal space in the generated URL, even though the rest of the query is already encoded (the other spaces appear as "+" and "=" appears as "%3D"). I could not find explicit URL encoding of the keyword in this code path.
Because I do not have a Scopus API key yet, I have not verified whether this causes incorrect behavior or whether the space is handled downstream (by the HTTP client or by Scopus itself).
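For reference, here is a minimal sketch of what I would expect an encoded URL to look like. It is not ComProScanner's code: `build_scopus_url` and its `quote_phrase` flag are hypothetical names I made up for illustration, and the query layout is simply copied from the generated URL above. It uses `urllib.parse.urlencode` from the standard library, which encodes spaces as "+". Whether Scopus also needs the keyword wrapped in double quotes to be treated as a phrase is an assumption I have not verified.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint copied from the generated URL shown in this issue.
BASE_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_url(keyword: str, year: int, quote_phrase: bool = False) -> str:
    """Hypothetical helper for illustration; NOT ComProScanner's _construct_url()."""
    keyword = keyword.strip()
    # Optional phrase quoting for multi-word keywords (unverified assumption
    # about what Scopus expects).
    if quote_phrase and " " in keyword:
        keyword = f'"{keyword}"'
    # Same query layout as the URL observed above: year filter, then keyword.
    query = f"PUBYEAR = {year} {keyword}"
    params = {"query": query, "count": 200, "cursor": "*"}
    # urlencode() percent-encodes every value, so no raw space survives.
    return f"{BASE_URL}?{urlencode(params)}"


url = build_scopus_url("thermal conductivity", 2024)
quoted_url = build_scopus_url("thermal conductivity", 2024, quote_phrase=True)

# Decoding the query string again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)
decoded_quoted = parse_qs(urlsplit(quoted_url).query)
print(url)
print(decoded["query"][0])
```

Decoding the result with `parse_qs` is a quick way to confirm that encoding did not change the query text, and it can be done without an API key.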
Context
I'm trying to understand whether this could be related to the multi-word keyword issue previously discussed in Slack.
My goal is to help validate and improve the metadata collection workflow before testing with Scopus access.
System & Environment
- OS: Windows 10
- Processor: Intel Core i5 (adjust if needed)
- Installed RAM: 8 GB (adjust if needed)
- GPU Enabled: No
- Terminal: Windows Command Prompt (CMD)
- Python environment: Python 3.12.10
Additional context
I can provide additional testing results once I obtain access to a Scopus API key.