MissingRemoteId: also check EGIT_REPO_URI#643
MissingRemoteId: also check EGIT_REPO_URI#643parona-source wants to merge 1 commit intopkgcore:masterfrom
Conversation
da2225e to
e4431da
Compare
src/pkgcheck/checks/metadata_xml.py
Outdated
| urls = urls.union(pkg.homepage) | ||
|
|
||
| if "git-r3" in pkg.inherited and hasattr(pkg, "environment"): | ||
| egit_repo_uri = re.compile(r"EGIT_REPO_URI=\"(.*)\"") |
There was a problem hiding this comment.
generally speaking, what you're up to here isn't safe due to bash being 'bash'. Minimally that has to be anchored to start/endline.
Personally, I'd rather not do regex on bash since it's not robust. EBD could be tweaked to spit back arbitrary requested variables which is far safer- as in, do the parse, then pull the raw value out of the bash env post render.
There may be other tooling for this that pkgcore/pkgcheck uses that I'm unaware of, however.
There was a problem hiding this comment.
Would you be happy with this now?
egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$")
There was a problem hiding this comment.
assume I'm never happy due to personality flaws. :)
egit_repo_uri = re.compile(r"^declare -[^ ]* EGIT_REPO_URI=([\"'])(?<content>.*)\1$")
Is probably tighter, albeit untested;
- the declare part ignores if it's exported. It also ignores if it's an array... which is fine to ignore. Even I won't boil that particular ocean. :)
- note the
(?P<content>- named group. I suggest this since you should move this compile as a class var; in doing so, you want to disconnect any hardcoded regex positional assumptions, and use a named capture instead.
Note; you're hardcoding the whitespace usage. If this ever goes to tabs, etc. That problem is out of your scope- this is something that should be implemented centrally so any consumer, like you, just can iterate through "here's all the vars defined in that ebuild's rendered scope". That central implementation is the one that should be explicitly specific and anal about things I'm mentioning; we shouldn't do that in end consumer implementations like this, however.
Does that make sense? If so, any interest in tackling it as a seperate PR after this one? If not, just let me know and I'll ticket this for future work.
There was a problem hiding this comment.
Does that make sense? If so, any interest in tackling it? If not, just let me know and I'll ticket this for future work.
Yes it does. And yeah I can look into this. Another change to dive into pkgcheck internals :)
There was a problem hiding this comment.
Can you open a ticket laying out what you'd want in terms of API capabilities? I have notions enumerated below, but you're writing checks- I don't. The backend API's are where I stick to. Said plainly: I may be very much talking out of my ass, even if I'm open to wiring what you're asking for.
My rough thoughts here:
-
you clearly want a mapping of variables. Id' think the value objects should also have attributes exposing the bash bits like array or export, so y'all can write QA against "this shouldn't be exported" or "this variables typing must be array".
-
Is a mapping of functions actually useful? I think I've seen checks that try to detect if some "you must never override this phase" eclass export was overriden. NFC, but if a mapping of functions is added- down the line- I suspect adding an exposure of "this is the chain of overrides" might be useful. Or not; again, I just do internals, I don't write checks.
If it's not immediately useful, I'd just suggest the object api of the parsed environment is designed for this to be added. IE, evolve the api capabilities; it doesn't have to be 'everything' up front.
-
I loathe that this shouldn't be part of the pkgcore API itself. Pkgcore is a package manager- thus minimal deps are required for it. Tree-sitter isn't a viable dep IMO, and ultimately, what you're doing should be moved from regex to tree-sitter to deal w/ bash being 'bash' in regards to ways to do stuff. The regex I proposed is solid- tree-sitter just ensures we're not sensitive to bash changes like this, thus why I think it's the end implementation, even if v1 is regex based.
TL;DR: unless @thesamesam @arthurzam @mgorny are good w/ a tree-sitter dep in pkgcore, the addition we're talking about here would have to be at the pkgcheck layer. I'd suggest injecting a pkg wrapper for results from repos that laces this in; basically you'll extend the pkg object's exposed API. If this doesn't make sense, ask- it's buried, but it's something we do in a lot of places for situations like this.
^^^ that's my thoughts coming at this from the guts/internals standpoint, and guessing about what y'all who write checks may want. Feel free to ignore/expand/whatever :)
There was a problem hiding this comment.
Also, to be clear- thanks for whatever you can suggest. Checks are never going to be simple, but the more we can provide a solid API- and then document the shit out of it and how to write checks- the better.
ferringb
left a comment
There was a problem hiding this comment.
Admittedly way the fuck late, but please give a repro command... preferably for modern pkgcheck. :)
I ran a quick pkgcheck against some ebuilds that had git-r3 as an inheritance- I could not see any reports generated related to this. IE, I need a modern repro please, unless this is no longer an issue..
Signed-off-by: Alfred Wingate <parona@protonmail.com>
e4431da to
8572940
Compare
|
One notable place where this useful is the kde overlay. A lot of the packages miss this remote. (Limiting category as there are quite a lot missing this) Addressing these in the repo would then make it easier to update them in ::gentoo and such would make my life easier by showing a nice invent.kde.org link for the packages in packages.gentoo.org. |
ferringb
left a comment
There was a problem hiding this comment.
drive by obesrvations/suggestions. The hasattr I'm particularly interested in.
The actual benefit of this, etc- that I'm not evaluating, just the implementation.
| urls = set(filter(self.__filter_url, all_urls)) | ||
| urls = sorted(urls.union(pkg.homepage), key=len) | ||
|
|
||
| if "git-r3" in pkg.inherited and hasattr(pkg, "environment"): |
There was a problem hiding this comment.
why is hasattr needed here?
I don't know this codepath fully- is there an implication that PackageRepoSource isn't guaranteed to surface the environment text data? Because from the pkgcore side, it should always be accessible afaik.
And if it isn't, I believe that is a bug and should be fixed.
| if "git-r3" in pkg.inherited and hasattr(pkg, "environment"): | ||
| egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$") | ||
| for env_line in pkg.environment.data.splitlines(): | ||
| result = re.search(egit_repo_uri, env_line) |
There was a problem hiding this comment.
if result := re.search(egit_repo_uri, env_line):
urls.append ...
| urls = sorted(urls.union(pkg.homepage), key=len) | ||
|
|
||
| if "git-r3" in pkg.inherited and hasattr(pkg, "environment"): | ||
| egit_repo_uri = re.compile(r"^declare -- EGIT_REPO_URI=\"(.*)\"$") |
There was a problem hiding this comment.
You're hardcoding that bash currently uses " for these directives, and that the export statement doesn't inject crap like actual 'export' (which the env saving implementation should- declare -e iirc, if it's marked as exported). Any of those will break this; a code comment note, and/or an expansion of the regex is warranted.
For the regex, this should be moved to a class var; it both is easier to spot it, and it avoids re.compile relying on it's internal LRU to avoid the recompile. If it must recompile, it's surprisingly not cheap.
Your code here implies pkgcore should provide a better api for this, also.
There was a problem hiding this comment.
Also- considering that regex has mild complexity to get it right- as I mentioned for things like export or array- perhaps this should be broke out in a later PR as a util func, and other regex's rebased to it?
| # Third generation eclass for easing maintenance of live ebuilds using | ||
| # git as remote repository. | ||
|
|
||
| case ${EAPI} in |
There was a problem hiding this comment.
This is out of scope for your PR, just an observation.
I suspect we should inject a function into ebuild/eclass environment tests that is require_eapi $@. It'll reduce this sort of boilerplate. It's our tests, this sort of helper would be fine.
What's others thoughts? And are there are other areas we should do this? Having the capability I'd expect would lead to finding usage.
IE, if we build the injection- is it just for this, or are we improving the general case of our tests?
Making it a draft as I'd like input if on if there is a better way to do to get the value of EGIT_REPO_URI with pkgcore's API.
Also I'm unsure if I correctly added the test, also a test scenario where EGIT_REPO_URI isnt set in the ebuild itself would be good to add.
Keep in mind that I want get the value of the variable post inherit as eclasses like kde.org set it as well.