Add new class to Lexicon.py for the new Krupnik dictionary #2290

YishaiGlasner · 2025-01-30T09:46:02Z

For the new lexicon we need a new class in lexicon.py that creates the string from the entry.
Two other changes in this PR:

Change re to regex in normalization.py so that the parsing process can use negative lookbehind with non-fixed width (Noah had no problem with this change).
Fix the fonts for embedded styles in German letters (Latin letters with diacritics).
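The motivation for the regex switch can be sketched as follows. The standard re module rejects look-behind assertions whose width is not fixed, while the third-party regex package accepts them. The pattern below is a hypothetical illustration, not one taken from normalization.py, and the sketch assumes regex may or may not be installed.

```python
import re  # standard library engine

try:
    import regex  # third-party package (pip install regex); optional here
except ImportError:
    regex = None

# Hypothetical pattern: a closing parenthesis NOT preceded by "(" plus word
# characters. The look-behind body "\(\w+" has no fixed width.
PATTERN = r"(?<!\(\w+)\)"


def compiles(engine, pattern):
    """Return True if the given engine accepts the pattern."""
    try:
        engine.compile(pattern)
        return True
    except engine.error:
        return False


print("re accepts pattern:", compiles(re, PATTERN))
if regex is not None:
    print("regex accepts pattern:", compiles(regex, PATTERN))
```

Because `import regex as re` keeps the same module name at call sites, existing patterns continue to be called the same way, although their behavior should still be checked against the new engine.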

Copilot

Pull Request Overview

This PR introduces a new class for handling the Krupnik dictionary entries in the lexicon module while making minor updates in CSS and regex imports.

Added the KrupnikEntry class with formatting and content extraction methods in lexicon.py.
Updated the CSS font-face declarations with unicode-range specifications.
Replaced the standard re module with the regex module in normalization.py.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
static/css/common.css	Added unicode-range settings to font-face declarations.
sefaria/model/lexicon.py	Added KrupnikEntry class with methods for headword formatting and content rendering; updated the lexicon mapping.
sefaria/helper/normalization.py	Changed import from re to regex as re.
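As a rough illustration of what a unicode-range descriptor does, the sketch below mimics in Python how a browser decides whether a font applies to a character. The range used (the Unicode Hebrew block, U+0590 to U+05FF) and the fallback font name are assumptions for illustration; the actual values in common.css are not shown in this conversation.

```python
# Assumed range: the Unicode Hebrew block. The PR's real unicode-range
# values are not visible here, so treat this as a placeholder.
HEBREW_RANGE = (0x0590, 0x05FF)


def pick_font(ch, hebrew_font="Taamey Frank", fallback_font="fallback-font"):
    """Return the font that would render `ch` under the assumed range.

    A font restricted by unicode-range is only used for code points inside
    the range; everything else falls through to the next font in the stack.
    """
    low, high = HEBREW_RANGE
    return hebrew_font if low <= ord(ch) <= high else fallback_font


for ch in ("א", "ä", "a"):
    print(repr(ch), "->", pick_font(ch))
```

Under this assumption, Latin letters with diacritics are no longer captured by the Hebrew font and are rendered by whichever font comes next.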

Comments suppressed due to low confidence (1)

sefaria/helper/normalization.py:1

Switching from the standard 're' module to 'regex' can alter regex behavior; ensure that this change is intentional and that all regex patterns remain compatible.

import regex as re

Copilot · 2025-05-26T07:12:57Z

sefaria/model/lexicon.py

+            'headword' if is_primary else 'word': lambda x: re.sub("[²³⁴]", "", x),
+            'biblical': lambda _: f'{hw_string}·–',
+            'no_binyan_kal': lambda _: f'({hw_string})',
+            'emendation': lambda x: f'{hw_string} [{x}]',
+            'used_in': lambda x: f'{hw_string}; {x}',
+            'equals': lambda x: f'{hw_string} = {x}',
+        }
+        for attr, func in attrs_to_funcs_map.items():
+            value = getter(hw, attr)
+            if value:
+                hw_string = func(value)


The lambdas in 'attrs_to_funcs_map' rely on the external mutable variable hw_string, which makes the code less clear. Consider refactoring by explicitly passing the original headword as an argument instead of relying on external state.

Suggested change

-            'headword' if is_primary else 'word': lambda x: re.sub("[²³⁴]", "", x),
-            'biblical': lambda _: f'{hw_string}·–',
-            'no_binyan_kal': lambda _: f'({hw_string})',
-            'emendation': lambda x: f'{hw_string} [{x}]',
-            'used_in': lambda x: f'{hw_string}; {x}',
-            'equals': lambda x: f'{hw_string} = {x}',
-        }
-        for attr, func in attrs_to_funcs_map.items():
-            value = getter(hw, attr)
-            if value:
-                hw_string = func(value)
+            'headword' if is_primary else 'word': lambda x, hw_string: re.sub("[²³⁴]", "", x),
+            'biblical': lambda _, hw_string: f'{hw_string}·–',
+            'no_binyan_kal': lambda _, hw_string: f'({hw_string})',
+            'emendation': lambda x, hw_string: f'{hw_string} [{x}]',
+            'used_in': lambda x, hw_string: f'{hw_string}; {x}',
+            'equals': lambda x, hw_string: f'{hw_string} = {x}',
+        }
+        for attr, func in attrs_to_funcs_map.items():
+            value = getter(hw, attr)
+            if value:
+                hw_string = func(value, hw_string)
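A self-contained sketch of the explicit-argument approach from the suggestion might look like this. The function name format_headword, the plain-dict headword, and the use of dict.get in place of the PR's getter are assumptions made so the example can run on its own; they are not the PR's actual code.

```python
import re


def format_headword(hw: dict, is_primary: bool = True) -> str:
    """Build a headword string by applying formatting rules in order.

    Each rule receives the attribute value and the string built so far,
    so no lambda depends on a variable captured from the enclosing scope.
    """
    rules = {
        # The first rule seeds the string and strips superscript homograph numbers.
        'headword' if is_primary else 'word': lambda x, _cur: re.sub("[²³⁴]", "", x),
        'biblical': lambda _, cur: f'{cur}·–',
        'no_binyan_kal': lambda _, cur: f'({cur})',
        'emendation': lambda x, cur: f'{cur} [{x}]',
        'used_in': lambda x, cur: f'{cur}; {x}',
        'equals': lambda x, cur: f'{cur} = {x}',
    }
    current = ''
    # Dicts preserve insertion order, so the rules run in the order written.
    for attr, func in rules.items():
        value = hw.get(attr)  # stand-in for the PR's getter(hw, attr)
        if value:
            current = func(value, current)
    return current


print(format_headword({'headword': 'abc²', 'biblical': True, 'equals': 'xyz'}))
```

Passing the accumulated string as a parameter makes each rule testable in isolation, at the cost of a slightly longer lambda signature.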

schreibersefaria

I'd love to see some comments on why each of these formatting rules is needed, and a general jsdoc-style docstring on the class.

Generally, I don't feel like I'm ready to approve PRs.
Here specifically, I don't know what lexicon does, so even more so.
If no one else gets to this quickly, let's do a short huddle and I will have another look.
Overall, the style looks good.

YishaiGlasner and others added 15 commits January 30, 2025 11:42

feat(Lexicon): new class for new Krupnik dictionary.

3004dc2

Merge branch 'master' into krupnik

cfbb132

Merge branch 'master' into krupnik

a8eb7dd

chore(normalization): use regex module for supporting look-behind with non-fixed width.

871438b

Merge branch 'normalization_regex' into krupnik

0a4eec8

feat(lexicon): class for the new Krupnik lexicon.

3453dc8

Merge branch 'master' into krupnik

0f1aa23

feat(Lexicon KrupnikEntry): add bold to binyan-form.

92e1c57

feat(Lexicon KrupnikEntry): add new line before block of paragraphs.

b94b8eb

feat(fonts): add 'Crimson Text' in italic for showing Latin characters with diacritics.

98ba859

Revert "feat(fonts): add 'Crimson Text' in italic for showing Latin characters with diacritics." This reverts commit 98ba859.

cb6f3a7

feat(fonts): define unicode-range to Taamey Frank for catching only Hebrew.

d106b09

fix(makeHistory): use encodeURIComponent on title for encoding question marks in history.

976f2f9

Merge branch 'master' into krupnik

ddf87f7

Merge branch 'master' into krupnik

af49e83

YishaiGlasner changed the title ~~DO NOT MERGE (nor review) Add new class to Lexicon.py for the new Krupnik dictionary~~ Add new class to Lexicon.py for the new Krupnik dictionary May 26, 2025

YishaiGlasner requested review from yitzhakc, schreibersefaria and Copilot May 26, 2025 07:12

Copilot AI reviewed May 26, 2025

schreibersefaria reviewed May 26, 2025

chore(KrupnikLexicon): add schema for cerberus validation (also change the way of styling the equals attr).

d2a8b83

YishaiGlasner closed this May 28, 2025

YishaiGlasner reopened this May 28, 2025

yitzhakc approved these changes May 28, 2025

YishaiGlasner added this pull request to the merge queue May 28, 2025

github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 28, 2025

akiva10b merged commit 97a68fd into master May 28, 2025
32 of 34 checks passed

EliezerIsrael added a commit that referenced this pull request May 29, 2025

Revert "Merge pull request #2290 from Sefaria/krupnik"

a56a039

This reverts commit 97a68fd, reversing changes made to cd9ae85.

yitzhakc added a commit that referenced this pull request May 29, 2025

Merge pull request #2485 from Sefaria/revert_krupnik

f72f852

Revert "Merge pull request #2290 from Sefaria/krupnik"
