Show HN: Turn native language audio into flashcards and shadowing practice

(lingochunk.com)

32 points | by alder 3 hours ago

8 comments

pzagor2 10 minutes ago
I also built a tool to help me study Spanish. I really like the idea of shadowing, so I built a tool that lets you take any YouTube video and generate a sentence-by-sentence exercise to help you repeat the speaker's phrases.
https://talkhabit.com/shadow Or, as an example of one exercise: https://talkhabit.com/shadow?videoUrl=https%3A%2F%2Fwww.yout...
Stuff I need to work on:
- It only works with videos that have auto-generated captions
- It works best with monologue videos
__float 1 hour ago
I don't know what resolution or display you built this on, but a heads up: the initial impression on my 4K monitor is that everything is incredibly tiny.
- alder 51 minutes ago
  To be honest I haven't tested it on a 4K monitor yet, so I am not surprised. There are two controls above the transcript that change the font size and the line spacing, which should help a bit for now. Something to fix, thanks!
hiAndrewQuinn 1 hour ago
Very nice work. I'm going for a different thing, but my audio2anki tool [1] is about as streamlined as I could make it for turning a YouTube URL I want to learn from into a stack of Anki flashcards, purely locally.
[1]: https://github.com/hiAndrewQuinn/audio2anki
jcg591 35 minutes ago
Very cool! I'm also learning Greek and it's amazing how many resources are becoming available.
- alder 19 minutes ago
  Thanks! Yes, it's getting better for Greek but still not on par with other languages. I completed the only 2 Greek levels on Duolingo and they are really boring compared to the German one I am doing now. Easy Greek is a bit above my level, and the number of YouTubers in Greek is tiny compared to German.
jrrv 1 hour ago
Is it possible to add traditional characters for Mandarin?
Also, the pinyin for 誰/谁 is coming through as shuí; whilst this character has two pronunciations, I believe shéi is the more common one.
- alder 30 minutes ago
  Thanks! Chinese and Japanese as source languages are still experimental. I did my best to support them, but I have to rely on people who actually know the language, and this kind of feedback is really useful. I'll look into adding traditional characters and fixing the pinyin.
  - jrrv 25 minutes ago
    No worries, I appreciate the effort. I did go back and listen, and they are indeed pronouncing shéi in the audio too.
    I use a Firefox extension to convert simplified to traditional. It looks like it's open source, so it may be of some use to you: https://github.com/tongwentang/tongwentang-extension.
    There are some clashes that it does not handle, though. For example, 隻 and 只 are both 只 in simplified, so you have to know which one is meant from context, and the extension fails to convert to 隻 where appropriate.
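The 只 clash can be illustrated with a minimal sketch. This is not how the tongwentang extension works; it is only one possible heuristic, assuming that 只 directly after a numeral or demonstrative is the measure word (traditional 隻) and is otherwise left as 只. The tables `ONE_TO_ONE` and `COUNT_CONTEXT` and the function `to_traditional` are hypothetical and deliberately tiny.

```python
# Hypothetical, deliberately tiny mapping; a real converter needs a full dictionary.
ONE_TO_ONE = {"猫": "貓", "鸟": "鳥", "谁": "誰"}

# Assumption: characters that often precede 只 when it is used as a measure word.
COUNT_CONTEXT = set("一二三四五六七八九十两几这那")


def to_traditional(text: str) -> str:
    """Convert simplified text to traditional, picking 隻 or 只 from local context."""
    out = []
    for i, ch in enumerate(text):
        if ch == "只":
            # Look at the previous character to guess whether this is a measure word.
            prev = text[i - 1] if i > 0 else ""
            out.append("隻" if prev in COUNT_CONTEXT else "只")
        else:
            # Unambiguous characters go through the plain one-to-one table.
            out.append(ONE_TO_ONE.get(ch, ch))
    return "".join(out)


print(to_traditional("我只有一只猫"))
```

A single-character lookbehind like this will misfire on some sentences; a more robust version would likely use word segmentation or part-of-speech tags to decide.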
Koaisu 57 minutes ago
Just tried it with an unsupported language and it still worked. I set it to Chinese and input the audio, and still got correct results.
3stacks 1 hour ago
This is awesome! I’ll be lurking for new data sources. I’m working on a self-hosted language app focused more on cloze and sentence mining into Anki. I love seeing more stuff happening in this space.
- alder 56 minutes ago
  Thanks! I am glad you like it! I essentially mine the source audio, and all examples have cloze-style gaps (blurring, in my case) that are revealed on the back of the card. I also beep the word in the sentence when you try to play it on the front of the card in the built-in SRS system. Unfortunately that is not implemented in the Anki export, but it is technically possible.
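As a rough illustration of what a cloze-style gap could look like in an Anki export, here is a minimal sketch. It is not lingochunk's implementation (which blurs the word rather than removing it); it only assumes Anki's standard `{{c1::...}}` cloze-deletion markup. The function names `make_cloze` and `blank_front` are hypothetical.

```python
def make_cloze(sentence: str, target: str, index: int = 1) -> str:
    """Wrap the first occurrence of `target` in Anki cloze-deletion markup."""
    if target not in sentence:
        raise ValueError(f"{target!r} not found in sentence")
    # In %-formatting the doubled braces are literal, giving {{c1::word}}.
    marked = "{{c%d::%s}}" % (index, target)
    return sentence.replace(target, marked, 1)


def blank_front(sentence: str, target: str, mask: str = "____") -> str:
    """Hide the first occurrence of `target`, as on the front of a card."""
    return sentence.replace(target, mask, 1)


print(make_cloze("Ich lerne Deutsch", "lerne"))
print(blank_front("Ich lerne Deutsch", "lerne"))
```

Plain substring replacement is enough for a sketch, but it would also match inside longer words, so a real export would probably work from token boundaries.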
dirteater_ 42 minutes ago
What are you doing for Chinese word segmentation/pinyin?
- alder 0 minutes ago
  For segmentation and POS I rely on spaCy's zh_core_web_sm, and the pinyin comes from the pypinyin library. There is also a small correction layer on top. But I am not a Chinese language expert, so I can't judge whether it really works, and I'll rely on feedback from users to improve it.
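alder does not say how the correction step works, so the following is only a guess at its general shape: a lookup table of preferred readings applied on top of whatever the pinyin library returns. The table `OVERRIDES` and the function `correct_readings` are hypothetical, and the base readings are passed in by hand here rather than produced by spaCy and pypinyin.

```python
# Hypothetical override table: readings preferred over the library default.
OVERRIDES = {"谁": "shéi", "誰": "shéi"}


def correct_readings(tokens, base_readings, overrides=OVERRIDES):
    """Apply per-token overrides to a parallel list of default readings."""
    if len(tokens) != len(base_readings):
        raise ValueError("tokens and base_readings must have the same length")
    # Keep the default reading unless the token has an explicit override.
    return [overrides.get(tok, reading) for tok, reading in zip(tokens, base_readings)]


tokens = ["他", "是", "谁"]
defaults = ["tā", "shì", "shuí"]  # stand-in for the pinyin library's output
print(correct_readings(tokens, defaults))
```

Keeping the overrides in a plain table means feedback like the 誰/谁 report can be applied without touching the segmentation or pinyin libraries themselves.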