[QUESTION]: Handling of multi-word keywords in Scopus query construction #4

@WilmerGaspar

Your Question

While reviewing the metadata collection workflow, I noticed that multi-word keywords appear to be inserted directly into the Scopus query URL. I have not yet verified whether this causes incorrect behavior in the Scopus API because I currently do not have access to a Scopus API key.

Is this the expected behavior, or should multi-word keywords be URL encoded or quoted before being passed to Scopus?

What I've Already Tried

  • Checked the documentation: Yes
  • Searched existing issues: Yes
  • Installed ComProScanner locally and traced the metadata collection workflow.
  • Followed the query path:

main_fetch() → _fetch_paginated_data() → fetch_and_process_data() → _construct_url()

  • Generated a test URL using the keyword:

thermal conductivity

Generated URL:

https://api.elsevier.com/content/search/scopus?query=PUBYEAR+%3D+2024+thermal conductivity&count=200&cursor=*

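The space between "thermal" and "conductivity" is left unencoded in the generated URL, even though the PUBYEAR part of the query is encoded (`+` for spaces, `%3D` for `=`). This suggests the keyword is appended after the rest of the query has been encoded, and I could not find explicit URL encoding for the keyword in this code path.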

I currently do not have access to a Scopus API key, so I have not yet verified whether this causes incorrect behavior or whether Scopus handles it internally.
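For reference, here is a minimal sketch of what I would expect an encoded version to look like, using only the standard library. `build_scopus_url` is a hypothetical helper I wrote for illustration, not ComProScanner's actual `_construct_url()`. The query layout (PUBYEAR clause followed by the keyword) is copied from the generated URL above. Wrapping multi-word keywords in double quotes is my assumption about how a phrase should be sent to Scopus, and it still needs to be confirmed against the API.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://api.elsevier.com/content/search/scopus"


def build_scopus_url(keyword: str, year: int, count: int = 200, cursor: str = "*") -> str:
    """Hypothetical helper: build a fully encoded Scopus search URL."""
    keyword = keyword.strip()
    # Assumption: multi-word keywords are sent as a quoted phrase.
    phrase = f'"{keyword}"' if " " in keyword else keyword
    # Same layout as the URL generated in this issue: PUBYEAR clause, then keyword.
    query = f"PUBYEAR = {year} {phrase}"
    params = {"query": query, "count": count, "cursor": cursor}
    # urlencode() percent-encodes every value; safe="*" keeps the cursor literal.
    return f"{BASE_URL}?{urlencode(params, safe='*')}"


url = build_scopus_url("thermal conductivity", 2024)
print(url)

# Decode the URL again to confirm the query survives the round trip.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

This check runs without an API key because it only inspects the URL string and does not send a request.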

Context

I'm trying to understand whether this could be related to the multi-word keyword issue previously discussed in Slack.

My goal is to help validate and improve the metadata collection workflow before testing with Scopus access.

System & Environment

  • OS: Windows 10
  • Processor: Intel Core i5
  • Installed RAM: 8 GB
  • GPU Enabled: No
  • Terminal: Windows Command Prompt (CMD)
  • Python environment: Python 3.12.10

Additional context

I can provide additional testing results once I obtain access to a Scopus API key.

Labels

bug, good first issue, question
