-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Gitlab Source: Backoff from Scan2 which is experimental to legacy pagination API call #4608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This commit rewrite simplifiedGitlabEnumeration to use legacy pagination API call with keyset pagination instead of Scan2 which is currently in experimental state. Note that this doesn't promise to fix this problem it's just a test to check. It also adds a retry logic in case any 500 error occurs. I added some logs as well to keep track of no of projects being enumerated.
martinlocklear
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this! Some notes:
- Definitely need test coverage for this before merging in.
- I notice in our go.mod we're overriding the version of the library we're using. Is it possible that some of the issues that we've been seeing could be fixed by newer versions of this library?
|
|
||
| projectQueryOptions := &gitlab.ListProjectsOptions{ | ||
| ListOptions: listOpts, | ||
| Membership: gitlab.Ptr(true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we have membership set to true. (I'm not saying we shouldn't, just that I don't understand)
From the docs:
Limit by projects that the current user is a member of.
If this flag is potentially causing problems, why add in this limitation? Why not just scan everything that's visible to the current token, member or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for adding this is that, if we don’t set membership to true for GitLab Cloud, the scan will enumerate all public GitLab projects. This is different from GitHub, where we explicitly restrict what can be scanned with the following condition:
if len(*githubScanOrgs) == 0 && len(*githubScanRepos) == 0 {
return scanMetrics, fmt.Errorf("invalid config: you must specify at least one organization or repository")
}This ensures that GitHub scans are limited to explicitly specified organizations or repositories. In GitLab, however, we don’t currently have an org option. As a result, a scan can run with just a token and attempt to enumerate everything that token has access to.
To prevent this on GitLab Cloud, we set membership=true so that only projects the user is a member of are enumerated. For self-managed GitLab instances, we disable this behavior, allowing the scan to enumerate all projects the token has access to on that instance.
I see two possible approaches:
- Add support for organization/group-based scanning in GitLab (similar to GitHub). With that restriction in place, we could remove the
membershipflag. - Keep the current behavior to avoid unintentionally scanning all public projects on GitLab Cloud.
The second option ensures we don’t accidentally scan the entire set of public GitLab Cloud projects.
25a7098 to
815ac8b
Compare
2e0dc27 to
4696e76
Compare
This PR refactors simplifiedGitlabEnumeration to use the legacy pagination API with keyset pagination instead of Scan2, which is currently in an experimental state. This change is intended as an experiment and does not guarantee a fix for the issue.
Additionally, it introduces retry logic to handle 500-level errors and adds logging to track the number of projects being enumerated.
Related issue on Gitlab: https://gitlab.com/gitlab-com/gl-infra/production/-/issues/7516
Description:
Explain the purpose of the PR.
Checklist:
make test-community)?make lintthis requires golangci-lint)?