Commit da7cf576 authored by Richard Zanibbi

Major update Oct 1-11, 2014 (RZ). Modifications summarized in "CHANGES" file.
parent f1414bff
CHANGES File
Oct. 1-9, 2014
R. Zanibbi
Major work on the library recently, including:
- Changed metric output format for evallg.py, adding information for
  detection f-measures; structure, object and relation detection and
  detection + classification rates; relative correc. class/detection rates
- Modified file outputs for evaluate script:
    - Now creating a human-readable spreadsheet (.csv) with all raw metrics,
      to facilitate analysis in spreadsheet programs, statistical packages, etc.
    - Producing list of processed files with Correct/Incorrect indication
    - Separate directory for .dot graphs for files with errors
    - Confusion matrix file now has a separate plot for just relationship vs.
      relationship confusions on edges, along with the 'full' edge
      label confusion matrix.
    * Significant effort to clarify correct merge but wrong class vs.
      segmentation disagreements in the Summary.txt and ConfusionMatrix.html
      files (in Summary, for directed and undirected edges).
- Created scripts to automate relabeling 'old' N/E format CROHME files
using '*' for merge edges (relabelEdges and relabelOldCROHME in bin/)
- Debugging of metric computations
    - D_S (merge edge rate) was incorrect when segments are defined
      inconsistently (e.g. only 'merge' in one direction, vs. both).
      Corrected.
    - Removed computation of percentage metrics to reduce raw result
      spreadsheet. These are easily computed, along with descriptive
      statistics and distributions in standard spreadsheets/statistical
      packages.
    - Various small errors related to when files are empty, have missing
      primitives, or have additional primitives within them.
- lg.csvObject() function will write out lg data in the new O/R format,
with a brief explanation of the format in the file.
- Final touches on debugging/refining lg2dot, which I modified significantly
in early September to produce more readable output that is more consistent
across the different graph types.
- Created file summarizing 'raw' metric data types and key data structures
in lg.py (which tend to be hard to remember).
- **Added ability to use 'evaluate' or 'confHist' with either a pair
of directories, or a list of files.
- Added lgOR and lgNE converters, so that files can be pretty printed
and/or converted easily between formats. Objects are still identified
by common labels on edges and nodes (i.e. '*' for merging primitives
is deprecated, although still accepted at input time).
- Added cdiff, ldiff and vdiff tools for finding instances/files with common
errors using regular expression matching over .diff files produced by the
'evaluate' script.
- *Debugged lg2txt.py, which was not working properly with some labels
(in particular, for square roots using 'Inside' relationships). Also
updated the translate/mathMLMap.csv file accordingly.
- Added ability to export a file list from structure confusion histogram output.
License and Copyright
The LgEval evaluation toolkit is Copyright (c) 01/06/2013, Richard Zanibbi
The LgEval evaluation toolkit is Copyright (c) 01/06/2013, 01/10/2014 Richard Zanibbi
and Harold Mouchère. LgEval is free software; you may redistribute it and/or
modify it under the terms of the Creative Commons CC BY-NC-SA 3.0
(Attribution-NonCommercial-ShareAlike 3.0 Unported)
......
-------------------------------------------------------------------
README: Summary of Metrics and Key Data Structures
LgEval 0.3.2
First Version: Oct. 10, 2014 (R. Zanibbi)
Copyright (c) 2014 Richard Zanibbi and Harold Mouchere
-------------------------------------------------------------------
LgEval uses a simple file-based approach to evaluating individual files and
collections of files.
The program src/evallg.py provides the main functions from which metrics are
computed for each input file, producing a metrics file (<infile>.csv) and a
file of specific differences between the input and its target (<infile>.diff,
another .csv file).
When using the 'evaluate' script, metrics and differences for individual files
are written to a <Results_Dir>/Metrics directory. These individual results are
concatenated and then used to produce the raw results spreadsheet and summary
files output by the 'evaluate' script.
:: Adding Metrics ::
Metrics may be added by modifying the lists of named metrics in the functions
lg.compare() and lg.compareSegments() in src/lg.py. Once added to one of these
lists, they will automatically be computed and compiled when using the
'evaluate' script.
DIFF FILE FORMAT (.diff)
---------------------------
.diff files are in CSV format, representing one pair of disagreeing labels per
line.
Node label errors:
    *N, Primitive ID, OutputLabel, OutputWeight, :vs:, TargetLabel, TargetWeight
Edge label errors (single line in .diff file):
    *E, Primitive ID (parent), Primitive ID (child), OutputLabel, OutputWeight, :vs:, TargetLabel, TargetWeight
Segmentation errors showing primitive pair where output and target disagree on
whether the two primitives belong to the same object:
    *S, PrimitiveID (parent), PrimitiveID (child)
NOTE: because primitive merges are represented by bidirectional edges, normally
if *S,1,2 appears in a .diff file, *S,2,1 will also appear in the file.
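The three record types above can be read back with a few lines of Python. This is a minimal sketch, not part of LgEval: the function name `parse_diff_lines` and the sample records are hypothetical, and it assumes labels never contain commas.

```python
import csv
from io import StringIO


def parse_diff_lines(text):
    """Split .diff text into node, edge and segmentation error records."""
    nodes, edges, segs = [], [], []
    for row in csv.reader(StringIO(text)):
        row = [field.strip() for field in row]
        if not row or not row[0]:
            continue
        kind = row[0]
        if kind == "*N":
            # *N, primitive, outLabel, outWeight, :vs:, tgtLabel, tgtWeight
            nodes.append((row[1], row[2], float(row[3]), row[5], float(row[6])))
        elif kind == "*E":
            # *E, parent, child, outLabel, outWeight, :vs:, tgtLabel, tgtWeight
            edges.append(
                (row[1], row[2], row[3], float(row[4]), row[6], float(row[7]))
            )
        elif kind == "*S":
            # *S, parent, child
            segs.append((row[1], row[2]))
    return nodes, edges, segs


# Hypothetical sample records, written to follow the formats listed above.
sample = """*N,s1,x,1.0,:vs:,X,1.0
*E,s1,s2,Right,1.0,:vs:,Sub,1.0
*S,s1,s2
*S,s2,s1
"""
nodes, edges, segs = parse_diff_lines(sample)
print(nodes)
print(edges)
print(segs)
```

Note that the segmentation error appears twice in the sample, once per direction, as described in the NOTE above.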
METRIC DESCRIPTIONS
--------------------------------------------------
Metrics have been organized into groups below. Please note that they currently
do not appear in this order in 00_RawResults.csv.
Targets
---------
nNodes                        Number nodes in ground truth (comparison) file
nEdges                        Number edges in ground truth (comparison) file
nSeg                          Number objects (segments) in ground truth (comparison)
nSegRelEdges                  Number object relationships in ground truth (comparison)
Detections
-----------
detectedSeg                   Number objects detected in input file
dSegRelEdges                  Number object relationships in input file
Primitive Metrics
-------------------
D_B                           Number of incorrect node and edge labels
D_C                           Number of incorrect node/primitive labels
D_L                           Number of incorrect edge labels
D_R                           Number of incorrect relationship edges
D_S                           Number of incorrect 'merge/object' edges
D_E(%)                        Weighted accuracy over node and edge labels
                              (see DRR 2013 paper)
nodeCorrect                   Number primitives (nodes) correctly labeled
segPairErrors                 Number incorrectly merged/split primitive pairs
edgeDiffClassCount            Number valid directed merge edges with incorrect object type
undirDiffClassCount           Number valid undirected merge edges with incorrect object type
dPairs                        Number incorrect *undirected* node pairs
Object Metrics
------------------
CorrectSegRelLocations        Number correct relationship locations
CorrectSegRels                Number correctly located and labeled relationships
CorrectSegments               Number correctly segmented objects
CorrectSegmentsAndClass       Number correctly segmented and labeled objects
ClassError                    Number incorrectly classified objects
SegRelErrors                  Number incorrectly detected segment relationships
Flags (1/0)
-------------
hasCorrectSegments            Object locations are correct
hasCorrectSegLab              Object locations and labels correct
hasCorrectRelationLocations   Object relationship locations correct
hasCorrectRelLab              Object relationship locations and labels correct
hasCorrectStructure           Object *and* relationship locations correct
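Because percentage metrics are no longer written to the raw results, rates have to be derived from the counts above. The following is a minimal sketch of one such derivation (object detection recall, precision and f-measure). The function name `detection_rates` and the sample counts are hypothetical, and pairing `CorrectSegments` with `nSeg` and `detectedSeg` this way is an assumption about how the counts are meant to be combined.

```python
def detection_rates(correct, n_target, n_detected):
    """Return (recall, precision, f_measure) from raw counts."""
    recall = correct / n_target if n_target else 0.0
    precision = correct / n_detected if n_detected else 0.0
    if recall + precision == 0:
        return recall, precision, 0.0
    f_measure = 2 * recall * precision / (recall + precision)
    return recall, precision, f_measure


# Hypothetical counts for a single file (one row of the raw results).
row = {"nSeg": 10, "detectedSeg": 8, "CorrectSegments": 6}
recall, precision, f_measure = detection_rates(
    row["CorrectSegments"], row["nSeg"], row["detectedSeg"]
)
print(recall, precision, f_measure)
```

The same function could be applied to `CorrectSegRels` with `nSegRelEdges` and `dSegRelEdges` for relationships, under the same assumption.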
LGEVAL KEY DATA STRUCTURES
------------------------------------
Label Graph Attributes (see lg.py)
-----------------------------------
lg.error                      Error flag
lg.cmpNodes                   Function used to compare node labelings
                              (default: Hamming distance, #disagreeing labels)
lg.cmpEdges                   Function used to compare edge labelings
                              (default: Hamming distance)
lg.nlabels                    Dictionary from node (primitive) identifiers to
                              another dictionary mapping labels to confidence values
                              (floating point values)
lg.elabels                    Dictionary from primitive pairs (edges) to
                              another dictionary mapping labels to confidence values
                              (floating point values)
lg.absentNodes                Set of identifiers not present in the lg relative
                              to another lg (e.g. relative to ground truth)
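To make the shapes of these attributes concrete, here is a small sketch using plain dictionaries rather than the actual lg.py class. The identifiers and labels are hypothetical, and `count_disagreements` is only one reading of "Hamming distance" (one count per primitive or edge whose label set differs); it is not the library's comparison function.

```python
# Node labels: primitive id -> {label: confidence}
nlabels_out = {"s1": {"x": 1.0}, "s2": {"2": 1.0}}
nlabels_tgt = {"s1": {"X": 1.0}, "s2": {"2": 1.0}, "s3": {"+": 1.0}}

# Edge labels: (parent id, child id) -> {label: confidence}
elabels_out = {("s1", "s2"): {"Right": 1.0}}
elabels_tgt = {("s1", "s2"): {"Sub": 1.0}}


def count_disagreements(labels_out, labels_tgt):
    """Count keys whose label sets differ between the two dictionaries."""
    keys = set(labels_out) | set(labels_tgt)
    return sum(
        1
        for key in keys
        if set(labels_out.get(key, {})) != set(labels_tgt.get(key, {}))
    )


# Identifiers present in the target but missing from the output.
absent_nodes = set(nlabels_tgt) - set(nlabels_out)

node_errors = count_disagreements(nlabels_out, nlabels_tgt)
edge_errors = count_disagreements(elabels_out, elabels_tgt)
print(node_errors, edge_errors, absent_nodes)
```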
Graph Segment Output (output of lg.segmentGraph())
----------------------------------------------------
segmentPrimitiveMap           Dictionary from object identifier to a pair (a,b)
                              a: list of labels; b: list of primitives in the object
primitiveSegmentMap           Dictionary from primitive identifier to
                              another dictionary mapping a label to the object id
                              associated with the primitive for this relationship type
rootSegments                  Set of objects with no incoming edges
segmentEdges                  Dictionary from pairs of object identifiers to
                              another dictionary mapping relationship labels to
                              confidence values
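The relationship between these four structures can be illustrated with plain dictionaries. This is a sketch with hypothetical object and primitive identifiers; the helper names are not from lg.py, and the real lg.segmentGraph() may build these values differently.

```python
# Object id -> (list of labels, list of primitives in the object)
segment_primitive_map = {
    "obj1": (["x"], ["s1", "s2"]),
    "obj2": (["2"], ["s3"]),
}

# (parent object id, child object id) -> {relationship label: confidence}
segment_edges = {("obj1", "obj2"): {"Sup": 1.0}}


def invert_segment_map(seg_map):
    """Build primitive id -> {label: object id} from the object-level map."""
    prim_map = {}
    for obj_id, (labels, primitives) in seg_map.items():
        for prim in primitives:
            for label in labels:
                prim_map.setdefault(prim, {})[label] = obj_id
    return prim_map


def find_root_segments(seg_map, edges):
    """Objects that never appear as the child of an object-level edge."""
    children = {child for (_parent, child) in edges}
    return set(seg_map) - children


primitive_segment_map = invert_segment_map(segment_primitive_map)
root_segments = find_root_segments(segment_primitive_map, segment_edges)
print(primitive_segment_map)
print(root_segments)
```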
'compareSegments' output (lg.compareSegments())
------------------------------------------------
edgeDiffCount                 Number of disagreeing 'merge/object' edges
segDiffs                      Dictionary mapping primitives to pairs of
                              primitives belonging to objects ( diff1, diff2 )
correctSegments               Set of (Obj id, label) pairs for correct segments
metrics                       List of metric (name, value) pairs - these are a
                              subset of the metrics described above
primRelEdgeDiffs              Dictionary mapping primitive pairs (edges) to an
                              error entry (should probably be simplified)
'compare' output (lg.compare())
---------------------------------
metrics                       All metrics described above
nodeconflicts                 List of pairs (node id, [ (l1, 1.0), (l2, 1.0) ])
                              where l1 and l2 are disagreeing labels
edgeconflicts                 List of pairs ( (nid_1, nid_2), [ (l1, 1.0), (l2, 1.0) ] )
                              where l1 and l2 are disagreeing labels
segDiffs                      Produced by compareSegments() (see above)
correctSegs                   Produced by compareSegments() (see above)
segRelDiffs                   Same as primRelEdgeDiffs from compareSegments()
                              (see above)
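As an illustration of how the conflict lists relate to the .diff record formats, the sketch below turns them into `*N` and `*E` lines. It rests on two assumptions that this README does not confirm: that each edge conflict is a pair of the form ((parent, child), [labels]), and that the first label in each conflict is the output label and the second is the target label. The function name and sample data are hypothetical.

```python
def conflicts_to_diff_lines(nodeconflicts, edgeconflicts):
    """Format conflict lists as .diff-style *N and *E records."""
    lines = []
    for node_id, ((out_label, out_w), (tgt_label, tgt_w)) in nodeconflicts:
        lines.append(
            f"*N,{node_id},{out_label},{out_w},:vs:,{tgt_label},{tgt_w}"
        )
    for (parent, child), ((out_label, out_w), (tgt_label, tgt_w)) in edgeconflicts:
        lines.append(
            f"*E,{parent},{child},{out_label},{out_w},:vs:,{tgt_label},{tgt_w}"
        )
    return lines


# Hypothetical conflicts in the shapes assumed above.
nodeconflicts = [("s1", [("x", 1.0), ("X", 1.0)])]
edgeconflicts = [(("s1", "s2"), [("Right", 1.0), ("Sub", 1.0)])]
diff_lines = conflicts_to_diff_lines(nodeconflicts, edgeconflicts)
print("\n".join(diff_lines))
```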
#!/bin/bash
if [ $# -lt 3 ]
then
echo "LgEval cdiff: Compile Errors from Diff Files"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: cdiff [-NESC^] outputPattern targetPattern <files>"
echo ""
echo "Return all lines (instances) of errors in diff files for"
echo "nodes or edges matching the provided patterns (egrep-format"
echo "regular expressions). Matching lines are written on standard output."
echo ""
echo "*Note: the pattern 'any' will match any label."
echo ""
echo "A single flag list token indicates whether to limit matches to"
echo "(N)ode label errors, (E)dge label errors, and/or files with"
echo "(S)egmentation errors or only (C)orrect segmentations. Including"
echo "^ in the flag list token will return files that do not match the"
echo "passed patterns."
exit 0
fi
# By default, return lines matching the given patterns.
FLIST=""
# Create initial list of all .diff files.
CFILES=""
FLAGS=""
if [[ $1 == -* ]]
then
# Grab flag string and shift.
FLAGS=$1
shift
fi
# Grab the patterns; shift to file list.
OUTP=$1
TARP=$2
shift
shift
# Take CFILES (current files) as all passed .diff files.
CFILES="$@"
# Indicate if we want the complement of a match.
if [[ $FLAGS == *^* ]]
then
FLIST="-v"
fi
# Note that segment error/correct seg are exclusive.
if [[ $FLAGS == *S* ]]
then
CFILES=`grep -l "^*S" $@`
elif [[ $FLAGS == *C* ]]
then
# -L flag selects inverse of the pattern ('negates' it)
CFILES=`grep -L "^*S" $@`
fi
# One or more non-comma characters: reg expression for 'any label' (*)
# Create the pattern string to use with grep, then run the filter and
# obtain the list of matching files.
ANYLABEL="[^,][^,]*"
MID=",1.0,:vs:,"
OUTLABEL=$ANYLABEL
TARLABEL=$ANYLABEL
if [ "$OUTP" != "any" ]
then
OUTLABEL="$OUTP"
fi
if [ "$TARP" != "any" ]
then
TARLABEL="$TARP"
fi
PATTERN="$OUTLABEL$MID$TARLABEL"
# Note that Node/Edge filtering is also exclusive.
if [[ $FLAGS == *N* ]]
then
PATTERN="^\*N.*$PATTERN"
elif [[ $FLAGS == *E* ]]
then
PATTERN="^\*E.*$PATTERN"
fi
# Use extended regular expressions to ease usage.
# FLIST allows the complement to be returned if desired.
CFILES=`grep $FLIST -E "$PATTERN" $CFILES`
# Write the matching file names on standard output.
for file in $CFILES
do
echo `basename $file`
done
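The pattern construction in the script above can be mirrored in Python for readers who want to filter .diff lines programmatically. This is a sketch of the same idea, not a replacement for the script: the function names and sample lines are hypothetical, and Python's `re` syntax stands in for egrep syntax (the two agree for the simple patterns used here).

```python
import re

ANY_LABEL = r"[^,][^,]*"  # one or more non-comma characters
MID = ",1.0,:vs:,"        # same separator string as the script


def build_pattern(out_pattern, target_pattern, flags=""):
    """Build the regular expression used to match error lines."""
    out_label = ANY_LABEL if out_pattern == "any" else out_pattern
    tgt_label = ANY_LABEL if target_pattern == "any" else target_pattern
    pattern = out_label + MID + tgt_label
    # Node/edge filtering is exclusive, as in the script.
    if "N" in flags:
        pattern = r"^\*N.*" + pattern
    elif "E" in flags:
        pattern = r"^\*E.*" + pattern
    return re.compile(pattern)


def filter_lines(lines, out_pattern, target_pattern, flags=""):
    """Return matching lines, or the complement if '^' is in flags."""
    regex = build_pattern(out_pattern, target_pattern, flags)
    invert = "^" in flags
    return [line for line in lines if bool(regex.search(line)) != invert]


# Hypothetical .diff lines.
lines = [
    "*N,s1,x,1.0,:vs:,X,1.0",
    "*E,s1,s2,Right,1.0,:vs:,Sub,1.0",
    "*S,s1,s2",
]
node_hits = filter_lines(lines, "x", "any", "-N")
edge_hits = filter_lines(lines, "any", "Sub", "-E")
non_label_lines = filter_lines(lines, "any", "any", "-^")
print(node_hits, edge_hits, non_label_lines)
```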
#!/bin/bash
if [ $# -lt 1 ]
then
echo "LgEval confHist: Structure Confusion Histogram Generator"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2013-2014"
echo ""
echo "Usage: confHist dir1 dir2 [minCount] [strokes] OR"
echo " confHist fileList [minCount] [strokes]"
echo ""
echo "Creates an .html file containing structure confusion histograms"
echo "at the object level. The histograms visualize errors by their"
echo "frequency when comparing files in dir1 vs. dir2 (dir2 is 'ground truth')."
echo ""
echo "If a file list is provided, then each line of the file"
echo "(where each line is: 'inputfile outputfile') is used for comparisons."
echo ""
echo "minCount is the minimum number of times an error should occur before"
echo "detailed information is provided in the confusion histogram. By default,"
echo "all errors are shown (minCount = 1)."
echo ""
echo "If an optional argument is provided (<strokes>), then stroke"
echo "confusion histograms will be constructed in addition to object"
echo "confusion histograms."
echo ""
echo "Output is written to the file CH_<dir1>.html or"
echo "CH_<fileList>.html, depending upon the arguments used."
exit 0
fi
if [ -d $1 ]
then
# Two directories passed (hopefully).
# NOTE: Assumes same number of .lg files.
ls $1/*.lg > _f1
ls $2/*.lg > _f2
paste -d" " _f1 _f2 > d_$1
rm -f _f1 _f2
INFILE=d_$1
shift
python $LgEvalDir/src/confHists.py $INFILE $@
rm d_$1
else
# User-provided file list.
python $LgEvalDir/src/confHists.py $@
fi
exit 0
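The file list accepted by the script above has one pair of paths per line. The script builds it by pasting two directory listings together by position, which relies on both directories holding the same files. The sketch below is a different technique, pairing files by shared name instead; the function name `make_file_list` and the placeholder files are hypothetical.

```python
import os
import tempfile


def make_file_list(output_dir, truth_dir):
    """Pair each ground-truth .lg file with the same-named output file."""
    pairs = []
    for name in sorted(os.listdir(truth_dir)):
        if not name.endswith(".lg"):
            continue
        candidate = os.path.join(output_dir, name)
        if os.path.exists(candidate):
            pairs.append((candidate, os.path.join(truth_dir, name)))
    return pairs


# Demonstration with throwaway directories and empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    out_dir = os.path.join(root, "out")
    gt_dir = os.path.join(root, "gt")
    os.mkdir(out_dir)
    os.mkdir(gt_dir)
    for directory, names in ((out_dir, ["a.lg", "b.lg"]), (gt_dir, ["a.lg", "c.lg"])):
        for name in names:
            open(os.path.join(directory, name), "w").close()
    pairs = make_file_list(out_dir, gt_dir)
    # One "output target" pair per line, as expected in a file list.
    file_list_lines = [f"{out_path} {tgt_path}" for out_path, tgt_path in pairs]
    paired_names = [
        (os.path.basename(out_path), os.path.basename(tgt_path))
        for out_path, tgt_path in pairs
    ]
print(paired_names)
```

Files present in only one directory (b.lg and c.lg here) are skipped rather than mispaired.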
......@@ -10,103 +10,189 @@
# in your .bashrc file (the initialization file for bash shell). The PATH
# alteration will add the tools to your search path.
if [ $# -lt 2 ]
if [ $# -lt 1 ]
then
echo "LgEval Label graph evaluation tool"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo "LgEval evaluate: Label graph evaluation tool"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: evaluate outputDir groundTruthDir [t/d/s/b]"
echo "Usage: evaluate outputDir groundTruthDir [p/t/d/s/b] OR"
echo " evaluate fileList [p/t/d/s/b]"
echo ""
echo "Evaluates all label graph (.lg) files in outputDir against"
echo "corresponding files in groundTruthDir. groundTruthDir is used"
echo "to generate the list of files to be compared (i.e. if a file is"
echo "not in the ground truth directory, it will not be considered)."
echo ""
echo "If a list of files is provided instead ('output target' on each line)"
echo "then these file pairs are used for evaluation."
echo ""
echo "Outputs"
echo "-----------------------------"
echo " Results<outputDir>/"
echo " Summary : summary of performance metrics"
echo " Correct : list outputDir files matching ground truth"
echo " Metrics.m : metrics for all .lg files compared"
echo " Diffs.diff : all differences between files"
echo " ConfusionMatrix.html : node and edge label confusion matrix"
echo " (errors only)"
echo " Results<outputDir/fileListName>/"
echo " 00_RawResults.csv: raw metrics spreadsheet"
echo " 00_Summary.txt: summary of performance metrics"
echo " ConfusionMatrices.csv/.html: confusion matrix spreadsheet (errors)"
echo " FileResults.csv: all evaluated files, with correct/incorrect indication"
echo ""
echo " Metrics/ : directory with .m (metric) and .diff (difference) file for"
echo " each comparison. .dot (GraphViz) and .pdf files are generated for"
echo " viewing differences between files if a third argument is provided."
echo " Metrics/: directory with .csv (metric) and .diff (difference) files"
echo " errorGraphs/: if dot output requested, visualizations for files with"
echo " errors are stored here (.dot and .pdf format)."
echo ""
echo "NOTE: the different visualizations of structural differences are described"
echo " if you run lg2dot without arguments (object (t)ree; (d)irected graph"
echo " over objects; primitive (s)egmentation graph; (b)ipartite graph over"
echo " primitives."
echo " primitives; (p): default directed graph over primitives)."
exit 0
fi
dir=$1
DOTARG=""
BNAME=`basename $1`
truthDir=$2
ResultsDir=Results_$BNAME
MODE="Dir"
TARGETS=""
OUTPUTS=""
if ! [ -d $1 ]
then
echo "File List: $1"
MODE="List"
# Get the targets
OUTPUTS=`awk '{ print $1; }' $1`
OUTARR=($OUTPUTS)
TARGETS=`awk '{ print $2; }' $1`
if [ $# -gt 1 ]
then
DOTARG=$2
fi
else
echo "Output File Directory: $1"
echo "Ground Truth Directory: $2"
TARGETS=`ls $2/*.lg`
if [ $# -gt 2 ]
then
DOTARG=$3
fi
fi
echo ""
ResultsDir=Results_$BNAME
if ! [ -d $ResultsDir ]
then
mkdir $ResultsDir
mkdir $ResultsDir/Metrics
if [ "$DOTARG" != "" ]
then
mkdir $ResultsDir/errorGraphs
mkdir $ResultsDir/errorGraphs/dot
mkdir $ResultsDir/errorGraphs/pdf
fi
fi
echo "Output File Directory: $1"
echo "Ground Truth Directory: $2"
echo ""
# Compute all .m metrics outputs (per-file), and .diff results (per-file).
# Compute all .csv metrics outputs (per-file), and .diff results (per-file).
echo "Evaluating files..."
PREFIX=res_
for file in `ls $truthDir/*.lg`
INDEX=0
for file in $TARGETS
do
FNAME=`basename $file .lg`
nextFile=$dir/$FNAME.lg
if [[ ! -e $ResultsDir/Metrics/$FNAME.m ]]
nextFile="_ERROR_"
if [ $MODE == "Dir" ]
then
nextFile=`echo "$1/$FNAME.lg" | perl -p -i -e "s/\/\//\//g"`
else
# Index to the next input file.
nextFile=${OUTARR[INDEX]}
fi
if [[ ! -e $ResultsDir/Metrics/$FNAME.csv ]]
then
# NOTE: the script convertCrohmeLg can be used to convert
# crohme .inkml files to .lg files.
echo " >> Comparing $FNAME.lg"
python $LgEvalDir/src/evallg.py $nextFile $file m > $ResultsDir/Metrics/$FNAME.m
DIFF=`python $LgEvalDir/src/evallg.py $nextFile $file diff`
python $LgEvalDir/src/evallg.py $nextFile $file m INTER > $ResultsDir/Metrics/$FNAME.csv
DIFF=`python $LgEvalDir/src/evallg.py $nextFile $file diff INTER`
CORRECT="Correct"
if [ -n "$DIFF" ]
then
echo "$DIFF" > $ResultsDir/Metrics/$FNAME.diff
# Record files with errors in the "Errors" file.
#echo "$nextFile" >> $ResultsDir/FilesIncorrect.txt
# If a third argument is provided, generate a .pdf file to visualize
# differences between graphs.
if [ $# -gt 2 ]
if [ "$DOTARG" != "" ]
then
lg2dot $nextFile $file $3
mv $FNAME.dot $FNAME.pdf $ResultsDir/Metrics/
if [ "$DOTARG" == "p" ]
then
lg2dot $nextFile $file
else
lg2dot $nextFile $file "$DOTARG"
fi
mv $FNAME.dot $ResultsDir/errorGraphs/dot
mv $FNAME.pdf $ResultsDir/errorGraphs/pdf
fi
CORRECT="Incorrect"
else
rm -f $ResultsDir/Metrics/$FNAME.diff
echo "$nextFile" >> $ResultsDir/Correct
# Record correct files in the "Correct" file.
#echo "$nextFile" >> $ResultsDir/FilesCorrect.txt
fi
# Add record of evaluating the file.
echo "$nextFile, $CORRECT" >> $ResultsDir/FileResults.csv
else
echo " Already processed: $file"
fi
INDEX=$((INDEX+1))
done
# Compile all metrics/diffs,
# and then compute metric summaries and confusion matrices.
cat $ResultsDir/Metrics/*.m > $ResultsDir/Metrics.m
cat $ResultsDir/Metrics/*.csv > $ResultsDir/$BNAME.csv
ALLDIFFS=`ls $ResultsDir/Metrics | grep .diff`
if [ -n "$ALLDIFFS" ]
then
cat $ResultsDir/Metrics/*.diff > $ResultsDir/Diffs.diff
cat $ResultsDir/Metrics/*.diff > $ResultsDir/$BNAME.diff
else
touch $ResultsDir/__NoErrors
touch $ResultsDir/Diffs.diff # empty - no errors.
touch $ResultsDir/00_NoErrors
touch $ResultsDir/$BNAME.diff # empty - no errors.
fi
python $LgEvalDir/src/sumMetric.py $ResultsDir/Metrics.m > $ResultsDir/__Summary
python $LgEvalDir/src/sumDiff.py $ResultsDir/Diffs.diff html > $ResultsDir/ConfusionMatrix.html
python $LgEvalDir/src/sumMetric.py $ResultsDir/$BNAME.csv > $ResultsDir/00_Summary.txt
python $LgEvalDir/src/sumDiff.py $ResultsDir/$BNAME.diff html > $ResultsDir/ConfusionMatrices.html
python $LgEvalDir/src/sumDiff.py $ResultsDir/$BNAME.diff > $ResultsDir/ConfusionMatrices.csv
# RZ Oct. 2014: Create spreadsheet pairing file names with metrics.
# Clean up raw metric data to make the file smaller and simpler.
# Use awk and head to select every odd (headers) and even (data) columns,
# Concatenate one header row with data contents.
awk -F',' '{ for (i=1;i<=NF;i+=2) printf ("%s%c", $i, i + 2 <= NF ? "," : "\n")}' $ResultsDir/$BNAME.csv > $ResultsDir/Headers.csv
# Obtain first row for data labels; insert a "File" label in the first column.
head -n 1 $ResultsDir/Headers.csv > $ResultsDir/HeaderRow.csv
HEAD=`cat $ResultsDir/HeaderRow.csv`
echo "File,Result,$HEAD" > $ResultsDir/HeaderRow.csv
awk -F',' '{ for (i=2;i<=NF;i+=2) printf ("%s%c", $i, i + 2 <= NF ? "," : "\n")}' $ResultsDir/$BNAME.csv > $ResultsDir/Data.csv
# Combine file names with raw data metrics, then add header labels.
paste -d , $ResultsDir/FileResults.csv $ResultsDir/Data.csv > $ResultsDir/DataNew.csv
cat $ResultsDir/HeaderRow.csv $ResultsDir/DataNew.csv > $ResultsDir/00_RawResults.csv
# Clean up.
rm -f $ResultsDir/Headers.csv $ResultsDir/HeaderRow.csv $ResultsDir/Data.csv
rm -f $ResultsDir/DataNew.csv
# Remove the compiled metrics and differences, but leave the individual metric/diff
# files in Metrics to support debugging for malformed or missing files, etc.
rm -f $ResultsDir/$BNAME.csv $ResultsDir/$BNAME.diff
echo "done."
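The awk steps near the end of the script above split each per-file metrics row into its odd columns (metric names) and even columns (values), then prepend the file name and result. The same transformation can be sketched in Python. The function names and sample rows are hypothetical, and the alternating name,value layout is inferred from the script's comments rather than from evallg.py itself.

```python
def split_metric_row(line):
    """Split 'name,value,name,value,...' into (names, values)."""
    fields = [field.strip() for field in line.strip().split(",")]
    return fields[0::2], fields[1::2]


def build_raw_results(file_results, metric_lines):
    """Combine (file, result) pairs with metric rows under one header row."""
    header = None
    rows = []
    for (file_name, result), line in zip(file_results, metric_lines):
        names, values = split_metric_row(line)
        if header is None:
            # Header comes from the first row only, as with 'head -n 1'.
            header = ["File", "Result"] + names
        rows.append([file_name, result] + values)
    return ([header] if header else []) + rows


# Hypothetical per-file results and metric rows.
file_results = [("a.lg", "Correct"), ("b.lg", "Incorrect")]
metric_lines = ["nNodes,3,nEdges,6,D_C,0", "nNodes,4,nEdges,12,D_C,1"]
table = build_raw_results(file_results, metric_lines)
for row in table:
    print(",".join(row))
```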
#!/bin/bash
# Make sure that CROHMELibDir and LgEvalDir are defined in
# your shell environment, e.g. by including:
#
# export LgEvalDir=<path_to_LgEval>
# export CROHMELibDir=<path_to_CROHMELib>
# export PATH=$PATH:$CROHMELibDir/bin:$LgEvalDir/bin
#
# in your .bashrc file (the initialization file for bash shell). The PATH
# alteration will add the tools to your search path.
if [ $# -lt 2 ]
then
echo "LgEval evaluateMat: Label graph matrix evaluation (CROHME 2014)"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo ""
echo "Usage: evaluateMat outputDir groundTruthDir [t/d/s/b]"
echo ""
echo "Evaluates all label graph (.lg) files in outputDir against"
echo "corresponding files in groundTruthDir. groundTruthDir is used"
echo "to generate the list of files to be compared (i.e. if a file is"
echo "not in the ground truth directory, it will not be considered)."
echo "*The label graphs are filtered for different objects and"
echo " relationship types (e.g. for Rows and Columns), with"
echo " metrics compiled for the different levels/types of structure."
echo ""
echo "Outputs"
echo "-----------------------------"
echo " Results<outputDir>/"
echo " Summary : summary of performance metrics"
echo " Correct : list outputDir files matching ground truth"
echo " Metrics.csv : metrics for all .lg files compared"
echo " Diffs.diff : all differences between files"
echo " ConfusionMatrix.html : node and edge label confusion matrix"
echo " (errors only)"
echo ""
echo " Metrics/ : directory with .csv (metric) and .diff (difference) file for"
echo " each comparison. .dot (GraphViz) and .pdf files are generated for"
echo " viewing differences between files if a third argument is provided."
echo ""
echo "NOTE: the different visualizations of structural differences are described"
echo " if you run lg2dot without arguments (object (t)ree; (d)irected graph"
echo " over objects; primitive (s)egmentation graph; (b)ipartite graph over"
echo " primitives."
exit 0
fi
dir=$1
BNAME=`basename $1`
truthDir=$2
ResultsDir=Results_$BNAME
mkdir -p $ResultsDir/Metrics
mkdir -p $ResultsDir/MatMetrics
echo "Output File Directory: $1"
echo "Ground Truth Directory: $2"
echo ""
# Compute all .csv metrics outputs (per-file), and .diff results (per-file).
echo "Evaluating files..."
PREFIX=res_
for file in `ls $truthDir/*.lg`
do
FNAME=`basename $file .lg`
nextFile=$dir/$FNAME.lg
if [[ ! -e $ResultsDir/Metrics/$FNAME.csv ]]
then
# NOTE: the script convertCrohmeLg can be used to convert
# crohme .inkml files to .lg files.
echo " >> Comparing $FNAME.lg"
python $LgEvalDir/src/evallg.py $nextFile $file m > $ResultsDir/Metrics/$FNAME.csv
python $LgEvalDir/src/evallg.py $nextFile $file MATRIX $ResultsDir/MatMetrics/$FNAME
DIFF=`python $LgEvalDir/src/evallg.py $nextFile $file diff`
if [ -n "$DIFF" ]
then
echo "$DIFF" > $ResultsDir/Metrics/$FNAME.diff
else
rm -f $ResultsDir/Metrics/$FNAME.diff
echo "$nextFile" >> $ResultsDir/Correct
fi
else
echo " Already processed: $file"
fi
done
# Compile all metrics/diffs,
# and then compute metric summaries and confusion matrices.
cat $ResultsDir/Metrics/*.csv > $ResultsDir/Metrics.csv
ALLDIFFS=`ls $ResultsDir/Metrics | grep .diff`
if [ -n "$ALLDIFFS" ]
then
cat $ResultsDir/Metrics/*.diff > $ResultsDir/Diffs.diff
else
touch $ResultsDir/__NoErrors
touch $ResultsDir/Diffs.diff # empty - no errors.
fi
python $LgEvalDir/src/sumMetric.py $ResultsDir/Metrics.csv > $ResultsDir/__Summary
python $LgEvalDir/src/sumDiff.py $ResultsDir/Diffs.diff html > $ResultsDir/ConfusionMatrix.html
for typErr in Mat Cell Row Col Symb
do
cat $ResultsDir/MatMetrics/*${typErr}.csv > $ResultsDir/Metrics${typErr}.csv
python $LgEvalDir/src/sumMetric.py $ResultsDir/Metrics${typErr}.csv > $ResultsDir/_${typErr}_Summary
done
echo "done."
#!/bin/bash
if [ $# -lt 3 ]
then
echo "LgEval ldiff: List Files with Common Errors"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: ldiff [-NESC^] outputPattern targetPattern <files>"
echo ""
echo "Create a list of files that contain errors on label graph"
echo "nodes or edges matching the provided patterns (egrep-format"
echo "regular expressions). Matching files are written on standard output."
echo ""
echo "*Note: the pattern 'any' will match any label."
echo ""
echo "A single flag list token indicates whether to limit matches to"
echo "(N)ode label errors, (E)dge label errors, and/or files with"
echo "(S)egmentation errors or only (C)orrect segmentations. Including"
echo "^ in the flag list token will return files that do not match the"
echo "passed patterns."
exit 0
fi
# By default, return files matching the given patterns.
FLIST="-l"
# Create initial list of all .diff files.
CFILES=""
FLAGS=""
if [[ $1 == -* ]]
then
# Grab flag string and shift.
FLAGS=$1
shift
fi
# Grab the patterns; shift to file list.
OUTP=$1
TARP=$2
shift
shift
# Take CFILES (current files) as all passed .diff files.
CFILES="$@"
# Indicate if we want the complement of a match.
if [[ $FLAGS == *^* ]]
then
FLIST="-L"
fi
# Note that segment error/correct seg are exclusive.
if [[ $FLAGS == *S* ]]
then
CFILES=`grep -l "^*S" $@`
elif [[ $FLAGS == *C* ]]
then
# -L flag selects inverse of the pattern ('negates' it)
CFILES=`grep -L "^*S" $@`
fi
# one or more non-comma characters: reg expression for 'any label' (*)
# Create the pattern string to use with grep, then run the filter and
# obtain the list of matching files.
ANYLABEL="[^,][^,]*"
MID=",1.0,:vs:,"
OUTLABEL=$ANYLABEL
TARLABEL=$ANYLABEL
if [ "$OUTP" != "any" ]
then
OUTLABEL="$OUTP"
fi
if [ "$TARP" != "any" ]
then
TARLABEL="$TARP"
fi
PATTERN="$OUTLABEL$MID$TARLABEL"
# Note that Node/Edge filtering is also exclusive.
if [[ $FLAGS == *N* ]]
then
PATTERN="^\*N.*$PATTERN"
elif [[ $FLAGS == *E* ]]
then
PATTERN="^\*E.*$PATTERN"
fi
# Use extended regular expressions to ease usage.
# FLIST allows the complement to be returned if desired.
CFILES=`grep $FLIST -E "$PATTERN" $CFILES`
# Write the matching file names on standard output.
for file in $CFILES
do
echo `basename $file`
done
#!/bin/bash
if [ $# -lt 1 ]
then
echo "LgEval lg2NE: Node-Edge Label Graph Format Converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lg2NE <file.lg>"
echo ""
echo "Writes the contents of file.lg as an NE label graph file"
echo "on standard output."
exit 0
fi
python $LgEvalDir/src/lg2NE.py $1
#!/bin/bash
if [ $# -lt 1 ]
then
echo "LgEval lg2OR: Object-Relationship Label Graph Format Converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lg2OR <file.lg>"
echo ""
echo "Writes the contents of file.lg as an OR label graph file"
echo "on standard output."
exit 0
fi
python $LgEvalDir/src/lg2OR.py $1
......@@ -12,10 +12,10 @@
if [ $# -lt 1 ]
then
echo "LgEval Label graph to dot (GraphViz) converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo "LgEval lg2dot: Label graph to dot (GraphViz) converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lg2mml file1.lg [file2.lg] [graph_type]"
echo "Usage: lg2dot file1.lg [file2.lg] [graph_type]"
echo ""
echo "Converts a label graph file to a .dot file,"
echo "which can then be converted to a .pdf, .png or other"
......@@ -26,8 +26,9 @@ then
echo "file) is visualized."
echo ""
echo "The graph_type argument may be one of the following:"
echo " - [default; no argument] a bipartite graph over strokes."
echo " - s : a bipartite segmentation graph (shows strokes in symbols)"
echo " - [default; no argument] a directed graph over strokes."
echo " - b : a bipartite graph over primitives"
echo " - s : a bipartite segmentation graph"
echo " - d : a directed acyclic graph over strokes"
echo " - t : a tree (NOTE: requires a valid hierarchical structure)"
exit 0
......@@ -35,17 +36,7 @@ fi
BNAME=`basename $1 .lg`
if [ $# -eq 1 ]
then
python $LgEvalDir/src/lg2dot.py $1 > $BNAME.dot
elif [ $# -eq 2 ]
then
# Use tree output by default.
python $LgEvalDir/src/lg2dot.py $1 $2 > $BNAME.dot
else
python $LgEvalDir/src/lg2dot.py $1 $2 $3 > $BNAME.dot
fi
# Call dot and generate a .pdf file.
# Generate dot file, then call dot and generate a .pdf file.
python $LgEvalDir/src/lg2dot.py $@ > $BNAME.dot
dot -Tpdf $BNAME.dot -o $BNAME.pdf
......@@ -12,8 +12,8 @@
if [ $# -lt 2 ]
then
echo "LgEval Label graph to Label graph tree converter "
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo "LgEval lg2lgtree: Label graph to Label graph tree converter "
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lg2lgtree <indir> <outdir>"
echo ""
......
......@@ -12,12 +12,12 @@
if [ $# -lt 1 ]
then
echo "LgEval Label graph to text converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo "LgEval lg2mml: Label graph to Presentation MathML converter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lg2mml file.lg"
echo ""
echo "Converts a label graph file files to a MathML file,"
echo "Converts a label graph file to a MathML file,"
echo "written as file.mml to the current directory."
exit 0
fi
......
......@@ -12,8 +12,8 @@
if [ $# -lt 1 ]
then
echo "LgEval Label graph edge filter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2013"
echo "LgEval lgfilter: Label graph edge filter"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: lgfilter <infile> [outfile]"
echo ""
......
#!/bin/bash
# Make sure that CROHMELibDir and LgEvalDir are defined in
# your shell environment, e.g. by including:
#
# export LgEvalDir=<path_to_LgEval>
# export CROHMELibDir=<path_to_CROHMELib>
# export PATH=$PATH:$CROHMELibDir/bin:$LgEvalDir/bin
#
# in your .bashrc file (the initialization file for bash shell). The PATH
# alteration will add the tools to your search path.
if [ $# -lt 1 ]
then
echo "Usage: eval <outputDir> [groundTruthDir]"
echo ""
echo "Compares all .lg files in groundTruthDir to matching"
echo "files (currently, with res_ prefix, and .inkml.lg suffix)."
echo ""
echo "Note: this script compares label graphs (.lg files)."
echo ""
echo "Output files:"
echo " - .m (metric) and .diff (difference) files for"
echo " each test file in outputDir"
echo " - ./Correct[outputDir] containing the list of correct"
echo " files from outputDir (i.e. matching groundTruth)"
echo " - ./Metrics[outputDir] compiling all metric values"
echo " - ./Diffs[outputDir] compiling all differences"
echo ""
echo " - ./Results[outputDir] summarizing performance metrics"
echo " - ./Diffs[outputDir].csv, providing node and edge label"
echo " confusion matrices (NOTE: currently errors only - no correct counts)"
exit 0
fi
dir=$1
echo "Evaluating recognizer output files in $dir..."
# Remove summary files.
rm -f Correct$dir Results$dir Diffs$dir Metrics$dir
# Compute all .m metrics outputs (per-file), and .diff results (per-file).
cd $dir
# Clean up old evaluation files
rm -f *.m *.diff
truthDir=testDataGT
if [ $# -gt 1 ]
then
truthDir=$2
fi
PREFIX=res_
#PREFIX="" # For use with ground truth data on itself.
for file in ../$truthDir/*.lg
do
nextFile=`basename $file`
python evallg.py $PREFIX$nextFile $file m > $nextFile.m
DIFF=`python evallg.py $PREFIX$nextFile $file diff`
if [ -n "$DIFF" ]
then
echo "$DIFF" > $nextFile.diff
else
echo "$PREFIX$nextFile" >> ../Correct$dir
fi
done
# Back to main directory - compile all metrics/diffs,
# and then compute metric summaries and confusion matrices.
cd ..
cat $dir/*.m > ./Metrics$dir
cat $dir/*.diff > ./Diffs$dir
python sumMetric.py Metrics$dir > Results$dir
python sumDiff.py Diffs$dir > Diffs$dir.csv
echo "finished."
#!/bin/bash
if [ $# -lt 1 ]
then
echo "LgEval relabelEdges: Edge relabeling tool"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: relabelEdges <file.lg> [ Label1 ] [ Replacement1 ] ..."
echo ""
echo "Replace edge labels in a 'raw' label graph file (using node (N) and"
echo "edge (E) entries for primitives). A list of edge labels and their"
echo "replacements are provided as arguments."
echo ""
echo "This was created to handle cases where * was used to represent"
echo "merge relationships, and edge relationships match symbol labels."
echo "These labels will conflict with segmentation edges in the new"
echo "label graph representation (e.g. if R is used both to label"
echo "symbols and to represent a Right-of relationship)."
exit 0
fi
grep "^N,\|#" $1 > HEAD_TEMP
grep "^E," $1 > EDGE_TEMP
# Shift the arguments to allow for simple iteration through pairs.
shift
while test $# -gt 0
do
perl -p -i -e "s/$1/$2/g" EDGE_TEMP
shift
shift
done
cat HEAD_TEMP
cat EDGE_TEMP
rm -f HEAD_TEMP EDGE_TEMP
exit 0
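The script above substitutes each label anywhere it occurs on an edge line. A stricter variant, sketched below in Python, replaces only the label field. It assumes edge entries are laid out as `E, parent, child, label, weight` (by analogy with the `*E` records in .diff files); the function name and sample lines are hypothetical.

```python
def relabel_edges(lines, replacements):
    """Replace edge labels on E entries, leaving all other lines untouched."""
    relabeled = []
    for line in lines:
        if line.startswith("E,"):
            fields = [field.strip() for field in line.split(",")]
            # Assumed layout: E, parent, child, label, weight
            if len(fields) >= 4 and fields[3] in replacements:
                fields[3] = replacements[fields[3]]
            relabeled.append(", ".join(fields))
        else:
            relabeled.append(line)
    return relabeled


# Hypothetical file contents: 'R' is both a symbol label and an edge label.
lines = [
    "# comment line",
    "N, s1, R, 1.0",
    "N, s2, x, 1.0",
    "E, s1, s2, R, 1.0",
]
result = relabel_edges(lines, {"R": "Right", "A": "Above"})
print("\n".join(result))
```

Restricting the substitution to one field keeps a primitive identifier that happens to contain a label character from being rewritten.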
#!/bin/bash
if [ $# -lt 1 ]
then
echo "LgEval relabelOldCROHME: Edge Relabeler for Old CROHME Files"
echo "Copyright (c) R. Zanibbi, H. Mouchere, 2012-2014"
echo ""
echo "Usage: relabelOldCROHME <dir>"
echo ""
echo "Used to relabel old 'raw' label graph files with N and E"
echo "entries, converting short names for relationships to the"
echo "longer ones used for CROHME 2014."
exit 0
fi
for file in $1/*.lg
do
# Replace (R)ight, (A)bove, (B)elow and (I)nside relationships.
relabelEdges $file R Right A Above B Below I Inside > tempFile
mv tempFile $file
done