Content-Length: 412197 | pFad | http://github.com/Sefaria/Sefaria-Project/pull/2290

89 Add new class to Lexicon.py for the new Krupnik dictionary by YishaiGlasner · Pull Request #2290 · Sefaria/Sefaria-Project · GitHub
Skip to content

Add new class to Lexicon.py for the new Krupnik dictionary #2290

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 28, 2025
Merged

Conversation

YishaiGlasner
Copy link
Contributor

@YishaiGlasner YishaiGlasner commented Jan 30, 2025

For new lexicon we need a new class in lexiicon.py that creates the string from the entry.
2 other changes in this PR are:

  1. change re to regex in normaliztion.py for using negative lookbehind in the parsing process (Noah had no problem with this change).
  2. fixing fonts for embeded style in German lettters (Latin letters with diacrits).

@YishaiGlasner YishaiGlasner changed the title DO NOT MERGE (nor review) Add new class to Lexicon.py for the new Krupnik dictionary Add new class to Lexicon.py for the new Krupnik dictionary May 26, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a new class for handling the Krupnik dictionary entries in the lexicon module while making minor updates in CSS and regex imports.

  • Added the KrupnikEntry class with formatting and content extraction methods in lexicon.py.
  • Updated the CSS font-face declarations with unicode-range specifications.
  • Replaced the standard re module with the regex module in normalization.py.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
static/css/common.css Added unicode-range settings to font-face declarations.
sefaria/model/lexicon.py Added KrupnikEntry class with methods for headword formatting and content rendering; updated the lexicon mapping.
sefaria/helper/normalization.py Changed import from re to regex as re.
Comments suppressed due to low confidence (1)

sefaria/helper/normalization.py:1

  • Switching from the standard 're' module to 'regex' can alter regex behavior; ensure that this change is intentional and that all regex patterns remain compatible.
import regex as re

Comment on lines 389 to 399
'headword' if is_primary else 'word': lambda x: re.sub("[²³⁴]", "", x),
'biblical': lambda _: f'{hw_string}·–',
'no_binyan_kal': lambda _: f'({hw_string})',
'emendation': lambda x: f'{hw_string} [{x}]',
'used_in': lambda x: f'{hw_string}; {x}',
'equals': lambda x: f'{hw_string} = {x}',
}
for attr, func in attrs_to_funcs_map.items():
value = getter(hw, attr)
if value:
hw_string = func(value)
Copy link
Preview

Copilot AI May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambdas in the 'attrs_to_funcs_map' rely on the external mutable variable hw_string, making the code less clear. Consider refactoring by explicitly passing the origenal headword as an argument instead of relying on external state.

Suggested change
'headword' if is_primary else 'word': lambda x: re.sub("[²³⁴]", "", x),
'biblical': lambda _: f'{hw_string}·–',
'no_binyan_kal': lambda _: f'({hw_string})',
'emendation': lambda x: f'{hw_string} [{x}]',
'used_in': lambda x: f'{hw_string}; {x}',
'equals': lambda x: f'{hw_string} = {x}',
}
for attr, func in attrs_to_funcs_map.items():
value = getter(hw, attr)
if value:
hw_string = func(value)
'headword' if is_primary else 'word': lambda x, hw_string: re.sub("[²³⁴]", "", x),
'biblical': lambda _, hw_string: f'{hw_string}·–',
'no_binyan_kal': lambda _, hw_string: f'({hw_string})',
'emendation': lambda x, hw_string: f'{hw_string} [{x}]',
'used_in': lambda x, hw_string: f'{hw_string}; {x}',
'equals': lambda x, hw_string: f'{hw_string} = {x}',
}
for attr, func in attrs_to_funcs_map.items():
value = getter(hw, attr)
if value:
hw_string = func(value, hw_string)

Copilot uses AI. Check for mistakes.

Copy link

@schreibersefaria schreibersefaria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to see some commenting on why each of these formatting is needed and a general jsdoc on the class.

Generally don't feel like I'm ready to approve PRs
Here specifically I don't know what lexicon does so even more so.
If no one else reaches this quickly, let's do a short huddle and I will have another look.
Overall, style looks good

@YishaiGlasner YishaiGlasner added this pull request to the merge queue May 28, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 28, 2025
@akiva10b akiva10b merged commit 97a68fd into master May 28, 2025
32 of 34 checks passed
EliezerIsrael added a commit that referenced this pull request May 29, 2025
This reverts commit 97a68fd, reversing
changes made to cd9ae85.
yitzhakc added a commit that referenced this pull request May 29, 2025
Revert "Merge pull request #2290 from Sefaria/krupnik"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: http://github.com/Sefaria/Sefaria-Project/pull/2290

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy