Koshort: Korean trends streaming in Python
Koshort is a Python package for streaming and processing Korean internet trends… or maybe an abbreviation of Korean domestic cat.
For step-by-step instructions, follow the tutorial and explore the examples. For detailed descriptions of each module, see the API documentation.
Lowering the barrier to the domain-specific internet corpus
Social network services and other internet communities are an open and rich source of human spoken language. However, because of privacy concerns and each website's policies, sharing retrieved text data is normally prohibited. To tackle major Natural Language Processing (NLP) problems under these circumstances, researchers have had to rely on limited public datasets and data provided by their companies. Otherwise, they had to implement a domain-specific crawler for each case.
Koshort is heavily inspired by the KoNLPy project and shares a similar philosophy. The goal is not to create yet another crawler but to unify existing efforts so that anyone can accelerate their projects.
Use the out-of-the-box script
> stream_naver
display_rank = False
filename = trends.txt
interval = 60
n_limits = 10
verbose = True
시크릿 마더
무법변호사
신아영
미얀마
로드fc
소진
위너
불후의명곡
그것이 알고싶다
짠내투어
아는형님
로또
로또806회
msi
전지적 참견 시점
김재훈
아이돌룸
토익
아오르꺼러
같이 살래요
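The filename option printed above suggests that the streamed keywords are also written to a file. As a minimal sketch of post-processing such a file, the function below counts how often each keyword appears. It assumes the file is UTF-8 text with one keyword per line, which is a guess based on the console output, not a documented format. The function name count_keywords is hypothetical and not part of koshort.

```python
from collections import Counter
from pathlib import Path


def count_keywords(path):
    """Count keyword occurrences in a one-keyword-per-line text file.

    Assumption: the file is UTF-8 encoded and holds one keyword per line.
    This helper is illustrative only and is not part of koshort.
    """
    text = Path(path).read_text(encoding="utf-8")
    # Strip surrounding whitespace and skip blank lines.
    keywords = [line.strip() for line in text.splitlines() if line.strip()]
    return Counter(keywords)
```

Calling count_keywords("trends.txt").most_common(5) would then list the five most frequently seen keywords, if the file follows the assumed layout.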
Use the koshort API
>>> from koshort.stream import NaverStreamer
>>> streamer = NaverStreamer()
>>> streamer.stream()
cj채용
온주완의 뮤직쇼
유상무
현대차
...
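To illustrate the general idea behind streaming trends, here is a minimal sketch of a polling loop. It is not koshort's implementation. The parameter names interval and n_limits are borrowed from the options printed by the script above, but their meaning here (seconds between polls, and maximum number of polls) is an assumption. The fetch callable is a hypothetical stand-in for whatever retrieves the current keyword list.

```python
import time


def poll_trends(fetch, interval=60, n_limits=10, sleep=time.sleep):
    """Yield keywords from repeated calls to fetch(), skipping repeats.

    fetch    -- hypothetical callable returning a list of keyword strings
    interval -- assumed meaning: seconds to wait between polls
    n_limits -- assumed meaning: maximum number of polls
    sleep    -- injectable so the loop can be tested without waiting
    """
    seen = set()
    for i in range(n_limits):
        for keyword in fetch():
            # Only report keywords that have not appeared in earlier polls.
            if keyword not in seen:
                seen.add(keyword)
                yield keyword
        # No need to wait after the final poll.
        if i < n_limits - 1:
            sleep(interval)
```

Passing sleep as a parameter keeps the sketch easy to test: a test can substitute a function that records the requested delays instead of actually waiting.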
Contribute
Koshort is a newly created library. It will keep evolving, and we need your help!
Found a bug? Do you have a good idea for improving koshort? Visit the Koshort GitHub page and suggest an idea or open a pull request.
You are also welcome to join our Discord.
Please note that asking questions through these channels is also a great contribution, because it gives the community feedback as well as ideas. Please don't hesitate to ask.