Setup: Everything Else

Overview

This page provides resources on how to do the following:

  1. Get the actual text to use Yomichan on.
  2. Get the pictures and/or sentence audio from the media.

There are plenty of well-established resources on how to do just that, ranging from software to written and video guides. Instead of repeating what others have already said, this page links to those programs and guides.

If you are looking to set up jp-mining-note, see this page instead.

Note

If you already have a sentence mining workflow, you can likely skip to this section.


Troubleshooting & Support

If you are having trouble with any of the guides or programs below, I unfortunately will not be able to provide detailed support.

Instead, I would recommend that you contact the creators of the guides / programs, or the communities surrounding said guides / programs.

Additionally, the guides listed here usually do not use JPMN, and instead link to other note types. This shouldn't be an issue as long as you change the field names appropriately.


Getting the Text to Create the Cards

I use a texthooker setup, which extracts subtitles or text into the browser. Once the text is in the browser, you can use Yomichan to select the word and create the Anki card (by clicking on the green plus button).

The standard texthooker setup works for most games, and any show with subtitle files.


Texthooker: Websocket based

These pages display the hooked content, which is sent to the page over WebSockets. WebSocket-based texthookers are better than the classic clipboard-based texthookers in almost every aspect:

  • They are generally faster and more reliable.
  • They do not flood your clipboard.
  • They do not require an extension that constantly polls the clipboard.

However, they require more specialized coordination between programs.

Resources (click here)
  • Renji's Texthooker Page (recommended)

    • I use these settings to make the text more compressed.
    • This texthooker page comes with built-in support for both websockets and clipboard inserter plugins.
  • exSTATic (recommended for stats lovers)

    • Its primary use is for automatic stats collection and visualizing said statistics.
    • Integrates seamlessly with many workflows.
    • Uses a custom texthooker page, which connects to Textractor through its own custom extension.
    • A video installation guide is available on the project's README page.

Supported Workflows:

Legacy Resources (click here)

These resources are considered legacy, and I highly recommend using the standard resources above instead of these.

  • Marv's Websocket Userscript

    • A more featureful version of the patch below.
    • Written for Anacreon's texthooker page.
  • Zetta's Custom Patch

    • Patch Instructions for existing clipboard-based texthookers.
    • This patch is intended to be used in conjunction with this Textractor extension.
    • This patch was written for Anacreon's texthooker page. However, it will likely work for most other texthooker pages.
    Instructions to use the patch (click here)

    Warning

    This is a monkey patch, even according to the author. Now that better alternatives have come out (see above), I recommend using those instead.

    1. Download your favorite texthooker page into a raw html file.
    2. Copy/paste the code below to the very end of the raw html file.
    3. If you are currently viewing the page, refresh.

    <script>
      let socket = null;
      let wsStatusElem = null;
    
      const createStatusElem = () => {
        wsStatusElem = document.createElement("span")
        let node = document.getElementById('menu').firstChild
        wsStatusElem.setAttribute("class", "menuitem")
        wsStatusElem.addEventListener('click', (e) => {
          if(wsStatusElem.innerText == "Reconnect") {
            connect()
          }
        })
        node.insertBefore(wsStatusElem, node.firstChild)
      }
    
      const updateStatus = (connected) => {
        if(wsStatusElem === null) { createStatusElem() }
        wsStatusElem.innerText = connected ? "Connected" : "Reconnect"
        wsStatusElem.style.cssText = "margin-right: 1.5em; display: inline-block;"
        wsStatusElem.style.cssText += connected ? "color:rgb(24, 255, 24);" : "color:rgb(255, 24, 24);"
      }
    
      const connect = () => {
        // The port must match the one the WebSocket server listens on.
        socket = new WebSocket("ws://localhost:6677/")
        socket.onopen = (e) => { updateStatus(true) }
        socket.onclose = (e) => { updateStatus(false) }
        // WebSocket error events carry no message, so log the event itself.
        socket.onerror = (e) => { updateStatus(false); console.log("[error] WebSocket error", e) }
        socket.onmessage = (e) => {
          // Append each received line as a new <p> at the end of <body>.
          let textNode = document.createElement("p")
          textNode.innerText = e.data
          document.body.appendChild(textNode)
        }
      }
      connect()
    </script>
    
    (Original Discord message, on the TMW server. Thanks to Zetta#3033 for the code.)

Texthooker: Clipboard based

These pages display the hooked content, where the hooked content is communicated via automated clipboard (copy/paste) tools. Most classic setups documented are for clipboard based texthooker pages.

Resources (click here)
Guides (click here)


Game-Like Content: Getting Text

The following are primarily for text-heavy games, such as visual novels.

Resources (click here)
  • Textractor (recommended)
  • agent
    • This is a good fallback for when Textractor doesn't work.
Guides (click here)

Video Content: Getting Text, Sentence Audio, Picture

Video content includes streamed content (Youtube, Netflix, etc.) and locally downloaded files.

Resources (click here)
  • mpvacious (recommended for downloaded videos / if you are using mpv)
    • Add-on for MPV, a cross platform media player. Personally tested.
    • Basically universal codec support since it uses mpv.
    • This add-on can also export the video clip itself as an animated image (an autoplaying WebP, similar to a GIF).
  • asbplayer (recommended for streamed sites)
    • Cross platform (chromium) browser video player. Personally tested.
    • Codec support is based on the browser used.
    • Works on video streaming sites, as well as downloaded videos.
  • Animebook
    • Cross platform (chromium) browser video player.
    • Codec support is based on the browser used.
  • All of the above require subtitle files to function. See here and/or here for some websites where you can get subtitles.
  • One challenge for video content is that subtitles are usually not aligned properly if the subtitles are downloaded separately from the video. I've always used a combination of mkvextract (to extract the subtitle file from the .mkv file) and alass (to align the native subtitles with reference subtitles, usually in a different language) to get the job done. If you want more options, see this page.
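
The mkvextract and alass steps described above can be sketched as a small shell function. This is only an illustration, not necessarily the exact commands the author runs. The function name align_subs, all file names, and the track ID are hypothetical. It assumes that both tools are on your PATH, that the alass binary is called alass (some installs name it alass-cli), and that the embedded reference track is in SRT format.

```shell
# Sketch only: file names and the track ID are placeholders.
# Usage: align_subs VIDEO.mkv TRACK_ID TARGET.srt OUTPUT.srt
align_subs() {
  video="$1"      # .mkv file that contains a correctly timed subtitle track
  track_id="$2"   # ID of that track inside the .mkv file
  target="$3"     # separately downloaded (misaligned) subtitle file
  output="$4"     # where to write the re-timed subtitles
  reference="${video%.*}.reference.srt"

  # 1. Extract the embedded reference subtitles from the video.
  mkvextract "$video" tracks "${track_id}:${reference}" || return 1

  # 2. Re-time the target subtitles so they line up with the reference.
  alass "$reference" "$target" "$output"
}
```

For example, `align_subs episode01.mkv 2 episode01.ja.srt episode01.aligned.srt`, where the track ID can be looked up with `mkvmerge --identify episode01.mkv`.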

Other:

  • jidoujisho
  • Immersive
    • Add-on for MPV. Alternative to mpvacious.
    • WARNING: This is potentially outdated and/or abandoned. The most recent commit as of writing (2022/10/19) was made on 2022/01/27. This is listed here for completeness only.
Guides (click here)

Manga: Getting Text

mokuro (recommended)

mokuro pre-processes manga, so you don't have to run any OCR program afterwards.

Guides:

  • Lazy guide (recommended)
    • (For Windows users) Make sure to check the "Add Python to Path" option during installation.
    • If you are using online processing (Google Colab), make sure that you are using the GPU to speed up the process.
  • Josuke's mokuro setup guide
    • Contact info: Josuke#7212 on the Refold (JP) Discord server
    • This doesn't include instructions on how to process online (whereas the Lazy guide does).

Other Resources:

If any error occurs, check the following:

  • Check your Python version (python --version, or python3 --version). Python 3.10 is not supported yet.

    If your Python version is too old, I recommend using pyenv (for Linux users). Linux users can use the automatic installer. For Windows users, it should be sufficient to uninstall mokuro, install a newer version of Python, and then re-install mokuro with the newer version.

  • Make sure your directory argument is treated as a string and not a number. For example, use mokuro ./01 on Unix and mokuro .\01 on Windows.

Manga OCR

Manga OCR allows you to automatically OCR any image. As the name suggests, this works best on manga.

Guides:


Books (EPUBs, HTMLZ, PDF)

As long as you're not using a scan (image-based), the text should already be available. Below are a few ways to view these files in a browser so that you can use Yomichan on them.

Resources (click here)

Other:

Guides (click here)

Getting Images & Sentence Audio Manually

Sometimes, there is no easy way to get the image and sentence audio other than with a screen recorder. The primary example for this is game-like content.

Here are two popular approaches to automatically adding the image and sentence audio:

ShareX (Windows)

ShareX

  • Windows media recorder which can both take screenshots and record audio. Personally tested.

Guides:

ames (Linux)

ames

  • ShareX alternative for Linux. Personally tested.
  • Primarily used to automate audio and picture extraction to the most recently added Anki card.

Resource Lists

Other websites have significantly larger resource lists that may prove useful for you.

Resource Lists (click here)

Notes on Various Programs

mpvacious

  • You will have to change the configuration in order for mpvacious to work with JPMN.

    Click here to see some basic config changes to get it working with JPMN.
    # Be sure to change deck_name to whatever your deck is!
    
    # Model names are listed in `Tools -> Manage note types` menu in Anki.
    model_name=JP Mining Note
    
    # Field names as they appear in the selected note type.
    # If you set `audio_field` or `image_field` empty,
    # the corresponding media file will not be created.
    sentence_field=Sentence
    #secondary_field=SentEng  # Not used by the note. This is ignored entirely.
    audio_field=SentenceAudio
    image_field=Picture
    

    You may want to increase the picture and audio quality, as it's extremely low by default. I personally use the following:

    # Sane values are 16k-32k for opus, 64k-128k for mp3.
    audio_bitrate=32k
    
    # Quality of produced image files. 0 = lowest, 100=highest.
    #snapshot_quality=15
    snapshot_quality=50
    
    # Image dimensions
    # If either (but not both) of the width or height parameters is -2,
    # the value will be calculated preserving the aspect-ratio.
    #snapshot_width=-2
    #snapshot_height=200
    snapshot_width=800
    snapshot_height=-2
    

    Info

    When creating the config file, ensure that the config file is placed in the correct folder. This script-opts folder does not exist by default. You will likely have to create the folder.

    Additionally, be sure to restart MPV after changing the config to apply the changes.

  • A common issue with mpvacious is that the SentenceReading field may differ from the Sentence field, say, if you export multiple subtitles into one card. See the FAQ on how to fix it.

  • To create cards with mpvacious, first add a card from Yomichan (usually via a texthooker), and then press ctrl+m in mpv.
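
As a sketch of the folder setup described in the Info note above, the following shell function creates the script-opts folder and an empty config file if they do not exist yet. The function name init_mpvacious_config is hypothetical. The default path (~/.config/mpv) and the file name subs2srs.conf are assumptions based on a typical Linux install of mpv and mpvacious, so check the mpvacious documentation for the location on your system.

```shell
# Usage: init_mpvacious_config [MPV_CONFIG_DIR]
# MPV_CONFIG_DIR defaults to the usual Linux location (an assumption).
init_mpvacious_config() {
  mpv_dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/mpv}"
  conf="$mpv_dir/script-opts/subs2srs.conf"

  # script-opts does not exist by default, so create it if needed.
  mkdir -p "$mpv_dir/script-opts" || return 1

  # Create an empty config file, but never overwrite an existing one.
  [ -e "$conf" ] || : > "$conf" || return 1

  # Print the path so you know which file to edit.
  printf '%s\n' "$conf"
}
```

Running `init_mpvacious_config` prints the path of the config file. Paste the settings into that file, then restart mpv.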


asbplayer

  • To use asbplayer, add the card with Yomichan, and then update the created note with asbplayer. I recommend filling out the fields as follows:

    asbplayer fields (click here)
    asbplayer field     JPMN field
    Sentence Field      Sentence
    Definition Field    (none)
    Word Field          (none)
    Audio Field         SentenceAudio
    Image Field         Picture
    Source Field        AdditionalNotes
    URL Field           AdditionalNotes

    Note

    Chances are that you are using subtitles. However, if you are not using subtitles, it is fine to keep the Sentence Field empty.

  • Any version of asbplayer released after 2023/01/16 (version 0.25.0 or higher) preserves the bolded word within the sentence. However, asbplayer shares a common issue with mpvacious: the SentenceReading field may differ from the Sentence field. See the FAQ on how to fix it.


jidoujisho

I'm not sure how the Anki card generation works for this app, since this app does not use Yomichan.

The custom handlebars used by JPMN does a lot of heavy lifting and has plenty of customizations made specifically to work with JPMN. Unfortunately, this handlebars is not very portable between programs.

If you want to use this app, I leave it to you to figure out the specifics of creating the cards.


Last update: April 2, 2023