Skip to content

MissingRemoteId: also check EGIT_REPO_URI#643

Draft
parona-source wants to merge 1 commit intopkgcore:masterfrom
parona-source:EGIT_REPO_URI_remoteid
Draft

MissingRemoteId: also check EGIT_REPO_URI#643
parona-source wants to merge 1 commit intopkgcore:masterfrom
parona-source:EGIT_REPO_URI_remoteid

Conversation

@parona-source
Copy link
Contributor

@parona-source parona-source commented Dec 22, 2023

Making it a draft as I'd like input if on if there is a better way to do to get the value of EGIT_REPO_URI with pkgcore's API.

Also I'm unsure if I correctly added the test, also a test scenario where EGIT_REPO_URI isnt set in the ebuild itself would be good to add.

Keep in mind that I want get the value of the variable post inherit as eclasses like kde.org set it as well.

@parona-source parona-source marked this pull request as ready for review December 25, 2023 11:16
urls = urls.union(pkg.homepage)

if "git-r3" in pkg.inherited and hasattr(pkg, "environment"):
egit_repo_uri = re.compile(r"EGIT_REPO_URI=\"(.*)\"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally speaking, what you're up to here isn't safe due to bash being 'bash'. Minimally that has to be anchored to start/endline.

Personally, I'd rather not do regex on bash since it's not robust. EBD could be tweaked to spit back arbitrary requested variables which is far safer- as in, do the parse, then pull the raw value out of the bash env post render.

There may be other tooling for this that pkgcore/pkgcheck uses that I'm unaware of, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be happy with this now?

egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$")

Copy link
Contributor

@ferringb ferringb Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume I'm never happy due to personality flaws. :)

egit_repo_uri = re.compile(r"^declare -[^ ]* EGIT_REPO_URI=([\"'])(?<content>.*)\1$")

Is probably tighter, albeit untested;

  • the declare part ignores if it's exported. It also ignores if it's an array... which is fine to ignore. Even I won't boil that particular ocean. :)
  • note the (?P<content>- named group. I suggest this since you should move this compile as a class var; in doing so, you want to disconnect any hardcoded regex positional assumptions, and use a named capture instead.

Note; you're hardcoding the whitespace usage. If this ever goes to tabs, etc. That problem is out of your scope- this is something that should be implemented centrally so any consumer, like you, just can iterate through "here's all the vars defined in that ebuild's rendered scope". That central implementation is the one that should be explicitly specific and anal about things I'm mentioning; we shouldn't do that in end consumer implementations like this, however.

Does that make sense? If so, any interest in tackling it as a seperate PR after this one? If not, just let me know and I'll ticket this for future work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that make sense? If so, any interest in tackling it? If not, just let me know and I'll ticket this for future work.

Yes it does. And yeah I can look into this. Another change to dive into pkgcheck internals :)

Copy link
Contributor

@ferringb ferringb Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you open a ticket laying out what you'd want in terms of API capabilities? I have notions enumerated below, but you're writing checks- I don't. The backend API's are where I stick to. Said plainly: I may be very much talking out of my ass, even if I'm open to wiring what you're asking for.

My rough thoughts here:

  • you clearly want a mapping of variables. Id' think the value objects should also have attributes exposing the bash bits like array or export, so y'all can write QA against "this shouldn't be exported" or "this variables typing must be array".

  • Is a mapping of functions actually useful? I think I've seen checks that try to detect if some "you must never override this phase" eclass export was overriden. NFC, but if a mapping of functions is added- down the line- I suspect adding an exposure of "this is the chain of overrides" might be useful. Or not; again, I just do internals, I don't write checks.

    If it's not immediately useful, I'd just suggest the object api of the parsed environment is designed for this to be added. IE, evolve the api capabilities; it doesn't have to be 'everything' up front.

  • I loathe that this shouldn't be part of the pkgcore API itself. Pkgcore is a package manager- thus minimal deps are required for it. Tree-sitter isn't a viable dep IMO, and ultimately, what you're doing should be moved from regex to tree-sitter to deal w/ bash being 'bash' in regards to ways to do stuff. The regex I proposed is solid- tree-sitter just ensures we're not sensitive to bash changes like this, thus why I think it's the end implementation, even if v1 is regex based.

    TL;DR: unless @thesamesam @arthurzam @mgorny are good w/ a tree-sitter dep in pkgcore, the addition we're talking about here would have to be at the pkgcheck layer. I'd suggest injecting a pkg wrapper for results from repos that laces this in; basically you'll extend the pkg object's exposed API. If this doesn't make sense, ask- it's buried, but it's something we do in a lot of places for situations like this.

^^^ that's my thoughts coming at this from the guts/internals standpoint, and guessing about what y'all who write checks may want. Feel free to ignore/expand/whatever :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, to be clear- thanks for whatever you can suggest. Checks are never going to be simple, but the more we can provide a solid API- and then document the shit out of it and how to write checks- the better.

Copy link
Contributor

@ferringb ferringb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly way the fuck late, but please give a repro command... preferably for modern pkgcheck. :)

I ran a quick pkgcheck against some ebuilds that had git-r3 as an inheritance- I could not see any reports generated related to this. IE, I need a modern repro please, unless this is no longer an issue..

Signed-off-by: Alfred Wingate <parona@protonmail.com>
@parona-source
Copy link
Contributor Author

parona-source commented Feb 19, 2026

One notable place where this useful is the kde overlay. A lot of the packages miss this remote.

https://github.com/gentoo/kde

(Limiting category as there are quite a lot missing this)

(pkgcheck-egit) ask@bigglane /home/ask/sources/kde.d/_ $ pkgcheck scan -k MissingRemoteId media-gfx/
media-gfx/kgeotag
  MissingRemoteId: missing <remote-id type="kde-invent">graphics/kgeotag</remote-id> (inferred from URI 'https://invent.kde.org/graphics/kgeotag')

media-gfx/kxstitch
  MissingRemoteId: missing <remote-id type="kde-invent">graphics/kxstitch</remote-id> (inferred from URI 'https://invent.kde.org/graphics/kxstitch')

media-gfx/kgraphviewer
  MissingRemoteId: missing <remote-id type="kde-invent">graphics/kgraphviewer</remote-id> (inferred from URI 'https://invent.kde.org/graphics/kgraphviewer')

media-gfx/skanpage
  MissingRemoteId: missing <remote-id type="kde-invent">utilities/skanpage</remote-id> (inferred from URI 'https://invent.kde.org/utilities/skanpage')

Addressing these in the repo would then make it easier to update them in ::gentoo and such would make my life easier by showing a nice invent.kde.org link for the packages in packages.gentoo.org.

Copy link
Contributor

@ferringb ferringb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drive by obesrvations/suggestions. The hasattr I'm particularly interested in.

The actual benefit of this, etc- that I'm not evaluating, just the implementation.

urls = set(filter(self.__filter_url, all_urls))
urls = sorted(urls.union(pkg.homepage), key=len)

if "git-r3" in pkg.inherited and hasattr(pkg, "environment"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is hasattr needed here?

I don't know this codepath fully- is there an implication that PackageRepoSource isn't guaranteed to surface the environment text data? Because from the pkgcore side, it should always be accessible afaik.

And if it isn't, I believe that is a bug and should be fixed.

if "git-r3" in pkg.inherited and hasattr(pkg, "environment"):
egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$")
for env_line in pkg.environment.data.splitlines():
result = re.search(egit_repo_uri, env_line)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if result := re.search(egit_repo_uri, env_line):
  urls.append ...

urls = sorted(urls.union(pkg.homepage), key=len)

if "git-r3" in pkg.inherited and hasattr(pkg, "environment"):
egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$")
Copy link
Contributor

@ferringb ferringb Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're hardcoding that bash currently uses " for these directives, and that the export statement doesn't inject crap like actual 'export' (which the env saving implementation should- declare -e iirc, if it's marked as exported). Any of those will break this; a code comment note, and/or an expansion of the regex is warranted.

For the regex, this should be moved to a class var; it both is easier to spot it, and it avoids re.compile relying on it's internal LRU to avoid the recompile. If it must recompile, it's surprisingly not cheap.

Your code here implies pkgcore should provide a better api for this, also.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also- considering that regex has mild complexity to get it right- as I mentioned for things like export or array- perhaps this should be broke out in a later PR as a util func, and other regex's rebased to it?

# Third generation eclass for easing maintenance of live ebuilds using
# git as remote repository.

case ${EAPI} in
Copy link
Contributor

@ferringb ferringb Feb 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is out of scope for your PR, just an observation.

I suspect we should inject a function into ebuild/eclass environment tests that is require_eapi $@. It'll reduce this sort of boilerplate. It's our tests, this sort of helper would be fine.

What's others thoughts? And are there are other areas we should do this? Having the capability I'd expect would lead to finding usage.

IE, if we build the injection- is it just for this, or are we improving the general case of our tests?

@parona-source parona-source marked this pull request as draft February 19, 2026 21:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments