Add HTML element selector to recipes for slimming down page content to relevant sections only #182

Merged
rsaksida merged 2 commits into main from feat/selective-page-elements on Mar 5, 2026

Conversation

@alexculealt
Collaborator

This PR adds an HTML element selector to recipes. When a selector is set, only the matching elements are used to generate the simplified markdown, which cuts down the token count (and therefore extraction costs) and speeds up extractions.

Closes #178

@alexculealt alexculealt requested a review from rsaksida February 23, 2026 19:23
export async function simplifiedMarkdown(html: string, contentSelector?: string) {
  let workingHtml = html;
  if (contentSelector?.trim()) {
    const $ = cheerio.load(html);
Member

We already load the HTML with cheerio in simplifyHtml. Can't this be done in simplifyHtml?

Collaborator Author

Adjusted, thanks for the suggestion!
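
For readers following the review thread, here is a minimal sketch of what applying the selector inside the HTML simplification step could look like, so the document is parsed with cheerio only once. It is not the merged code: the function name `simplifyHtmlSketch`, the fallback to the full page when the selector is empty or matches nothing, and the `script`/`style` cleanup standing in for the real simplification logic are all assumptions made for illustration.

```typescript
import * as cheerio from "cheerio";

// Hypothetical stand-in for simplifyHtml; the real function is not shown in this PR excerpt.
export function simplifyHtmlSketch(html: string, contentSelector?: string): string {
  // Parse once and reuse the same cheerio instance for selection and cleanup.
  const $ = cheerio.load(html);

  const selector = contentSelector?.trim();
  if (selector) {
    const matches = $(selector);
    // Assumption: if nothing matches, keep the full page rather than returning nothing.
    if (matches.length > 0) {
      // Serialize the outer HTML of each match and make that the new body content.
      const kept = matches
        .toArray()
        .map((el) => $.html(el))
        .join("\n");
      $("body").html(kept);
    }
  }

  // Placeholder cleanup standing in for the real simplification rules.
  $("script, style").remove();

  return $("body").html() ?? "";
}
```

Called as `simplifyHtmlSketch(pageHtml, "#content")`, this would keep only the element with id `content`. Note that a selector matching nested elements (for example `div`) would repeat the inner element in the output, so a real implementation may want to de-duplicate matches.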

@rsaksida rsaksida merged commit 26c5487 into main Mar 5, 2026
@rsaksida rsaksida deleted the feat/selective-page-elements branch March 5, 2026 16:48

Development

Successfully merging this pull request may close these issues.

Add targeting of specific HTML elements to extract data from into the recipes

2 participants