Old 09-05-2023, 11:08 AM   #1
Woodssi
Enthusiast
Posts: 45
Karma: 10
Join Date: Sep 2011
Location: Barrhead, Scotland
Device: Kindle Paperwhite (2)
Custom Search Parameters

Ok folks, question for the Devs or the super users of the application.

Is it possible to set up a custom search?

To outline...
When reading, I absolutely hate punctuation marks that are italicised
(this is a personal quirk, so no discussion or comment required please).

When editing my books, is there a way to have Sigil search for
:-
<span class="italic"> any punctuation mark </span>

My expertise level of using Sigil is such that I only ever use the 'Find' box, and I've just started reading a title that has quite a few whole chapters (letters written by a character) that are fully in italics.

It would save me an enormous amount of time if I could search in the manner I have described.

Thanks in advance.
Old 09-05-2023, 11:34 AM   #2
KevinH
Sigil Developer
Posts: 7,727
Karma: 5444398
Join Date: Nov 2009
Device: many
It is called regular expression find and replace, and Sigil can already do it. Check out the sticky thread about regular expressions; there is also help in the Sigil User's Guide.
Old 09-05-2023, 03:24 PM   #3
Woodssi
Enthusiast
Posts: 45
Karma: 10
Join Date: Sep 2011
Location: Barrhead, Scotland
Device: Kindle Paperwhite (2)
Much obliged, Kevin.

I think I found one that suits my purpose <span class="italics">[^<]*\s.*</span>

But... How do I use this?
I've never done anything like it before.

Can you point me in the right direction, please?

Last edited by Woodssi; 09-05-2023 at 03:34 PM.
Old 09-05-2023, 07:48 PM   #4
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Woodssi View Post
When reading, I absolutely hate punctuation marks that are italicised

When editing my books, is there a way to have Sigil search for
:-
<span class="italic"> any punctuation mark </span>
This is the regular expression I use:

Find: ([;’”,\)\]\.—])</span>
Replace: </span>\1

What it does is:
  • Find any of the punctuation listed inside the parentheses (before the </span>)
  • Shoves it OUTSIDE the </span>

So it would take something like this:

Code:
This is <span class="italics">A Book Title.</span>
“What did you say to me you piece of <span class="italics">crap?”</span>
and convert it into:

Code:
This is <span class="italics">A Book Title</span>.
“What did you say to me you piece of <span class="italics">crap</span>?”
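
To try the Find/Replace pair above outside Sigil, here is a minimal Python sketch. It assumes Python's built-in re module behaves like Sigil's engine for this simple pattern, and the function name move_trailing_punctuation is made up for illustration. One thing to watch: the character class as posted does not list ? or !, and a single pass moves only one character, so a span ending in ?” would probably need ?! added to the class and a second Replace All before both marks end up outside.

```python
import re

# The Find/Replace pair from the post, written for Python's re module.
# Both Sigil's Replace field and re.sub use \1 for the first capture group.
FIND = re.compile(r"([;’”,\)\]\.—])</span>")
REPLACE = r"</span>\1"


def move_trailing_punctuation(html: str) -> str:
    """Run the find/replace once over the whole string (one 'Replace All')."""
    return FIND.sub(REPLACE, html)


if __name__ == "__main__":
    samples = [
        'This is <span class="italics">A Book Title.</span>',
        '<span class="italics">crap?”</span>',
    ]
    for line in samples:
        print(move_trailing_punctuation(line))
```

With the class exactly as posted, the second sample comes out with only the closing quote moved, because ? is not in the list.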
You can also tweak it to whatever tags are needed, like one which will:

Find all Italics Ending Punctuation

Find: ([;’”,\)\]\.—])</i>
Replace: </i>\1

or:

Find all Italics Beginning Punctuation

Find: <i>([‘“\(—])
Replace: \1<i>

This will help correct things like:

Code:
<i>“Example Book</i> by First Last was the greatest book <em>ever!”</em>
by changing it to the correct:

Code:
“<i>Example Book</i> by First Last was the greatest book <em>ever</em>!”
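
A rough Python sketch of the two patterns above, in case it helps to see them side by side. The changes here are my own assumptions, not part of the posted regexes: the tag name is captured so one pattern covers both <i> and <em>, ! and ? are added to the closing class, and a + lets a run of marks move in one pass. The name fix_edges is hypothetical.

```python
import re

# Opening punctuation just inside <i> or <em>: move it in front of the tag.
OPENING = re.compile(r"<(i|em)>([‘“\(—]+)")
# Closing punctuation just inside </i> or </em>: move it behind the tag.
# The ! and ? are additions to the class from the post.
CLOSING = re.compile(r"([;’”,\)\]\.—!?]+)</(i|em)>")


def fix_edges(html: str) -> str:
    """Move leading and trailing punctuation outside <i>/<em> tags."""
    html = OPENING.sub(r"\2<\1>", html)
    html = CLOSING.sub(r"</\2>\1", html)
    return html


if __name__ == "__main__":
    sample = "<i>“Example Book</i> by First Last was the greatest book <em>ever!”</em>"
    print(fix_edges(sample))
```

In Sigil the same idea would go into the Find and Replace fields directly; the Python wrapper is only there to make the behaviour easy to check.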
- - -

Side Note: For more info, see my posts in:

If you need even more regex cleanup tips, see my posts in:

and if you need even more, type this into your favorite search engine:

Code:
Tex2002ans regex site:mobileread.com
Tex2002ans regular expression site:mobileread.com
I've given examples in hundreds of topics, many with color-coded, step-by-step explanations.

Last edited by Tex2002ans; 09-05-2023 at 07:56 PM.
Old 09-05-2023, 08:55 PM   #5
KevinH
Sigil Developer
Posts: 7,727
Karma: 5444398
Join Date: Nov 2009
Device: many
Check out the Find and Replace chapter in the Sigil User's Guide. There is also a Tutorial chapter on Advanced Find there as well.

You can download the Sigil User's guide as an epub from:

https://github.com/Sigil-Ebook/sigil...guide/releases
Old 09-05-2023, 10:30 PM   #6
DiapDealer
Grand Sorcerer
Posts: 27,596
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I find using the Unicode categories much easier/cleaner.

Any punctuation character can be matched with \p{P}

\p{P} or \p{Punctuation}: any kind of punctuation character.
\p{Pd} or \p{Dash_Punctuation}: any kind of hyphen or dash.
\p{Ps} or \p{Open_Punctuation}: any kind of opening bracket.
\p{Pe} or \p{Close_Punctuation}: any kind of closing bracket.
\p{Pi} or \p{Initial_Punctuation}: any kind of opening quote.
\p{Pf} or \p{Final_Punctuation}: any kind of closing quote.
\p{Pc} or \p{Connector_Punctuation}: a punctuation character such as an underscore that connects words.
\p{Po} or \p{Other_Punctuation}: any kind of punctuation character that is not a dash, bracket, quote or connector.
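
To see which of these categories a given character falls into, here is a small Python sketch. Python's built-in re module does not understand \p{...}, so this uses the standard unicodedata module instead; it only illustrates the categories and is not how Sigil evaluates them. The helper name punctuation_category is made up.

```python
import unicodedata
from typing import Optional


def punctuation_category(ch: str) -> Optional[str]:
    """Return the two-letter Unicode category if ch is punctuation, else None."""
    category = unicodedata.category(ch)
    return category if category.startswith("P") else None


if __name__ == "__main__":
    for ch in ["—", "(", ")", "“", "”", "_", "?", "a"]:
        print(repr(ch), punctuation_category(ch))
```

In Sigil's Find field, the ending-punctuation search would then presumably be something like (\p{P})</span> with </span>\1 as the replacement, though that exact pair is my assumption rather than something posted above.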
Old 09-07-2023, 08:17 AM   #7
JSWolf
Resident Curmudgeon
Posts: 74,542
Karma: 129670952
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by Tex2002ans View Post
This is the regular expression I use:...
How about regex that will take...
<i>"some, stuff? with, punctuation"</i>
and convert it to...
"<i>some, stuff</i>? <i>with</i>, <i>punctuation</i>"
Old 09-07-2023, 09:33 AM   #8
DiapDealer
Grand Sorcerer
Posts: 27,596
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Regex ain't magic.
Old 09-07-2023, 10:50 AM   #9
nabsltd
Evangelist
Posts: 417
Karma: 6913952
Join Date: Aug 2013
Location: Hamden, CT
Device: Kindle Paperwhite (11th gen), Scribe
Quote:
Originally Posted by DiapDealer View Post
Regex ain't magic.
An iterative application of a regex search and replace could pull all punctuation outside of a certain tag, but not like his example, since there are commas both inside and outside of the tag in the end result.
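
The iterative idea can be sketched in Python like this. It is only an illustration under my own assumptions: the pattern targets </i>, the character class is the one posted earlier in the thread plus ! and ?, and the names pull_trailing_punctuation and max_passes are invented. Each pass moves one trailing mark outside the tag, and the loop stops when a pass changes nothing.

```python
import re
from typing import Tuple

# One punctuation mark sitting directly before a closing </i>.
TRAILING = re.compile(r"([;’”,\)\]\.—!?])</i>")


def pull_trailing_punctuation(html: str, max_passes: int = 20) -> Tuple[str, int]:
    """Repeat the replace until nothing changes; return the text and pass count."""
    passes = 0
    while passes < max_passes:
        html, count = TRAILING.subn(r"</i>\1", html)
        if count == 0:
            break
        passes += 1
    return html, passes


if __name__ == "__main__":
    print(pull_trailing_punctuation("<i>Really?!”</i>"))
    print(pull_trailing_punctuation("<i>some, stuff? with, punctuation.</i>"))
```

Punctuation in the middle of the italic run is never touched, which is why this cannot produce the split-up result in the example.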

That said, tags tend to have semantic value, so any fully automated system will definitely get it wrong at times. <em></em> used like quotes around a thought should follow the same rules as quotation marks, so punctuation goes inside. Emphasizing just a word or phrase shouldn't have the tags include starting and ending punctuation.

Getting these right is by far the most time-consuming part of my fix of CSS from commercial eBooks.
Old 09-07-2023, 10:55 AM   #10
Turtle91
A Hairy Wizard
Posts: 3,119
Karma: 18727091
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
Quote:
Originally Posted by JSWolf View Post
How about regex that will take...
<i>"some, stuff? with, punctuation"</i>
and convert it to...
"<i>some, stuff</i>? <i>with</i>, <i>punctuation</i>"
Spoiler:


lol....If I didn't know you were kidding I'd be hurling all over the floor!!


I totally agree however... who cares whether punctuation is inside or outside the italics?!?! Honestly, if you are worried about that nitnoid stuff you are WAY more advanced than me....I just don't have the time to worry about that.



Luckily the OP said "this is a personal quirk, so no discussion or comment required please." Now I feel comfortable in NOT commenting on it...


Cheers!
Old 09-07-2023, 07:54 PM   #11
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by DiapDealer View Post
Regex ain't magic.


Quote:
Originally Posted by JSWolf View Post
How about regex that will take...
<i>"some, stuff? with, punctuation"</i>
and convert it to...
"<i>some, stuff</i>? <i>with</i>, <i>punctuation</i>"
First, run the two regexes I listed to take care of the easy cases.

Then you can use the fantastic functionality added in recent Sigil versions to get "Italic Lists", which list everything between an opening <i> and its closing </i>.

From there, you can do whatever extra tweaks are needed.

For more info on that, see my fantastic descriptions of the workflow back in:
  • 2023: "Semantic markup question!"
    • Follow the "Do I Put Spaces Inside Italics/Emphasis?" + "Does Punctuation Go Inside the <i> or <em>?" + especially the "What's This Text? <i> or <em>?", where I linked to my:

That allows you to quickly list all HTML that matches your regex into a simple to understand/search/sort list.

I described how that can be used to quickly map all:
  • italics <-> emphasis
  • acronyms/ALL CAPS <-> smallcaps
  • editing/marking dialogue tags

or many other helpful "mass editing" workflows.

The second you sort into a list, the huge ones with lots of punctuation will instantly stand out like a sore thumb:

Code:
<i>Enciclopedia Italiana</i>
<i>New York Times</i>
<i>This sentence is very long? And has lots, and lots, and lots of punctuation inside?</i>
<i>Wall Street Journal</i>
<i>Washington Post</i>
<i>individual</i>
<i>laissez-faire</i>
<i>negative</i>
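
For anyone who wants to experiment with the same kind of list outside Sigil, here is a rough Python stand-in. It is not Sigil's own feature, just a sketch: it collects every <i>...</i> run, removes duplicates, sorts them, and flags the ones carrying a lot of punctuation. The names italic_list and heavy_punctuation and the threshold of 3 are arbitrary choices of mine.

```python
import re
from typing import List

# Shortest possible run from an opening <i> to the next closing </i>.
ITALIC = re.compile(r"<i>.*?</i>", re.DOTALL)
# A small, hand-picked set of marks to count inside each run.
PUNCT = re.compile(r"[;:,\.!?—]")


def italic_list(html: str) -> List[str]:
    """Return the unique <i>...</i> runs in sorted order."""
    return sorted(set(ITALIC.findall(html)))


def heavy_punctuation(spans: List[str], threshold: int = 3) -> List[str]:
    """Keep only the runs containing at least `threshold` punctuation marks."""
    return [s for s in spans if len(PUNCT.findall(s)) >= threshold]


if __name__ == "__main__":
    sample = (
        "<p><i>Wall Street Journal</i> and <i>New York Times</i>; "
        "<i>negative</i>, <i>New York Times</i>. "
        "<i>This sentence is very long? And has lots, and lots, "
        "and lots of punctuation inside?</i></p>"
    )
    spans = italic_list(sample)
    for span in spans:
        print(span)
    print(heavy_punctuation(spans))
```

Python's default sort puts capitalised entries before lowercase ones, which happens to match the ordering in the list above.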

Last edited by Tex2002ans; 09-07-2023 at 08:10 PM.