Backport #29297 by @KN4CK3RFixes#29101
Related #29298
Discard all read data to prevent misinterpreting existing data. Some
discard calls were missing in error cases.
Co-authored-by: KN4CK3R <admin@oldschoolhack.me>
Co-authored-by: yp05327 <576951401@qq.com>
Fix#24896
If users set different languages by `linguist-language`, the `stats` map
could be: `java: 100, Java: 200`.
Language stats are stored as case-insensitive in database and there is a
unique key.
So, the different language names should be merged to one unique name:
`Java: 300`
Change all license headers to comply with REUSE specification.
Fix#16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
Close#20315 (fix the panic when parsing invalid input), Speed up #20231 (use ls-tree without size field)
Introduce ListEntriesRecursiveFast (ls-tree without size) and ListEntriesRecursiveWithSize (ls-tree with size)
* clean git support for ver < 2.0
* fine tune tests for markup (which requires git module)
* remove unnecessary comments
* try to fix tests
* try test again
* use const for GitVersionRequired instead of var
* try to fix integration test
* Refactor CheckAttributeReader to make a *git.Repository version
* update document for commit signing with Gitea's internal gitconfig
* update document for commit signing with Gitea's internal gitconfig
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
It appears possible that there could be a hang due to unread data from the
repo-attribute command pipes. This PR simply closes these during the defer.
Signed-off-by: Andrew Thornton <art27@cantab.net>
This PR registers requests with the process manager and manages hierarchy within the processes.
Git repos are then associated with a context, (usually the request's context) - with sub commands using this context as their base context.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Use check attribute code to check the assigned language of a file and send that in to
chroma as a hint for the language of the file.
Signed-off-by: Andrew Thornton <art27@cantab.net>
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Ignore Sync errors on pipes when doing `CheckAttributeReader.CheckPath`
* apply env patch
* Drop the Sync and fix a number of issues with the Close function
Signed-off-by: Andrew Thornton <art27@cantab.net>
* add logs for DBIndexer and CheckPath
* Fix some more closing bugs
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add test case for language_stats
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Update modules/indexer/stats/db.go
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Replaces #16262
Replaces #16250
Replaces #14833
This PR first implements a `git check-attr` pipe reader - using `git check-attr --stdin -z --cached` - taking account of the change in the output format in git 1.8.5 and creates a helper function to read a tree into a temporary index file for that pipe reader.
It then wires this in to the language stats helper and into the git diff generation.
Files which are marked generated will be folded by default.
Fixes#14786Fixes#12653
remove log() func from gogs times and switch to proper logging
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Andrew Thornton <art27@cantab.net>
* Improve get last commit using git log --name-status
git log --name-status -c provides information about the diff between a
commit and its parents. Using this and adjusting the algorithm to use
the first change to a path allows for a much faster generation of commit
info.
There is a subtle change in the results generated but this will cause
the results to more closely match those from elsewhere.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lauris BH <lauris@nix.lv>
`enry.IsVendor` is kinda slow as it simply iterates across all regexps.
This PR ajdusts the regexps to combine them to make this process a
little quicker.
Related #15143
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Extract out the common cat-file batch calls
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Move bleve and elastic indexers to use a common cat-file --batch when indexing
Signed-off-by: Andrew Thornton <art27@cantab.net>
* move catfilebatch to batch_reader and rename to batch_reader.go
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
* Use cat-file --batch in GetLanguageStats
This PR moves to using a single cat-file --batch in GetLanguageStats
significantly reducing the number of processes spawned during language stat
processing.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* placate lint
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Update modules/git/repo_language_stats_nogogit.go
Co-authored-by: a1012112796 <1012112796@qq.com>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: a1012112796 <1012112796@qq.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
* Move last commit cache back into modules/git
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Remove go-git from the interface for last commit cache
Signed-off-by: Andrew Thornton <art27@cantab.net>
* move cacheref to last_commit_cache
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Remove go-git from routers/private/hook
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Move FindLFSFiles to pipeline
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Make no-go-git variants
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Submodule RefID
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix issue with GetCommitsInfo
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix GetLastCommitForPaths
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Improve efficiency
Signed-off-by: Andrew Thornton <art27@cantab.net>
* More efficiency
Signed-off-by: Andrew Thornton <art27@cantab.net>
* even faster
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Reduce duplication
* As per @lunny
Signed-off-by: Andrew Thornton <art27@cantab.net>
* attempt to fix drone
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix test-tags
Signed-off-by: Andrew Thornton <art27@cantab.net>
* default to use no-go-git variants and add gogit build tag
Signed-off-by: Andrew Thornton <art27@cantab.net>
* placate lint
Signed-off-by: Andrew Thornton <art27@cantab.net>
* as per @6543
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Change language statistics to save size instead of percentage in database
Co-Authored-By: Cirno the Strongest <1447794+CirnoT@users.noreply.github.com>
* Do not exclude if only language
* Fix edge cases with special langauges
Co-authored-by: Cirno the Strongest <1447794+CirnoT@users.noreply.github.com>
Move langauge detection to separate module to be more reusable
Add option to disable vendored file exclusion from file search
Allways show all language stats for search
* Implementation for calculating language statistics
Impement saving code language statistics to database
Implement rendering langauge stats
Add primary laguage to show in repository list
Implement repository stats indexer queue
Add indexer test
Refactor to use queue module
* Do not timeout for queues