8.   Neural Filters
8.1     Test-Driving the Label Matching Problem
8.2   Neural Filter Overview
8.3   Training
8.4   N:N Matching
8.5   1:N Matching
8.6   Checking the Results
8.7   Analysis
8.8   Summary



8.   Neural Filters

Before the ABM Engine processes an image directly (Input Space Matching), it will go through a pre-matching step, called Feature Space Matching. The purpose of Feature Recognition is to eliminate unmatched images. This Feature Recognition sub-layer has two filters:

BioFilter; and
NeuralFilter.


Additional Feature Space Filters  (for example, the filter used in the software Attrasoft DecisionMaker) can be ordered in a customized version.

The BioFilter will eliminate about 80% of the unmatched images and the Neural Filter will eliminate an additional 19%, leaving only about 1% unmatched images.

Figure 8.1   Selecting NeuralFilter.

Operating the Neural Filter is exactly the same as the BioFilter. This chapter will demonstrate how the Neural Filter works using the Label-Matching example of the last chapter. The Neural Filter is more accurate. The Neural Filter will automatically include the BioFilter. Also, the Neural Filter only has Supervised Learning; therefore, the Neural Filter must be trained before it can be used.

8.1     Test-Driving the Label Matching Problem

We will go through the following steps in this section:

Initialization sets the ImageFinder parameters. The ImageFinder then maps an image into a record in a feature space. Finally, we will do a matching.

Figure 8.2   NeuralFilter Menu.

Figure 8.3   �NeuralFilter\Query Set + Target Set� Menu; please see the Reference menu chapter.

There are 304 images in this example; they are located at the directory �.\biofilterex1�, where �.\� in the ImageFinder software location. The N:N Match will compare each of the 304 images with all 304 images.

Figure 8.4  An image in the �.\biofilterex1� folder.

Figure 8.5  NeuralFilter Parameter.

Initialization of Neural Filter


Converting Images to Records

Training Template Matching Results
C:\�\L01008gi_r90.jpg
C:\�\L01008gi_r90.jpg
C:\�\L01008gi-020501.jpg


The result file contains many blocks. The number of blocks is the same as the number of images in the search-directory, i.e. each image has a block.  Line 1 in each block is input and the rest of the lines are output; i.e. the first line is the image matched against all images in the search-directory, the rest of the lines represent the matched images. For example, �C:\�\L01008gi_r90.jpg� is matched against all 304 images in the search-directory, and there are two matches, listed in the next two lines.

8.2   Neural Filter Overview

The Image Matching will be done in several steps:


Let us look at each phase.

Initialization

Initialization sets the ImageFinder parameters.
Converting Images to Records if you have not done so in the BioFilter:
An image is mapped into a record in a feature space. This step is slow (several images per second); however:

(1) This step can be done once for all; and
(2) This is linear, i.e. the time is directly proportional to the number of images.

Therefore, this step does not have much impact on the operating speed.


Training


Template Matching

The matching speed will be between 100,000 � 1,000,000 comparisons per second.
Both BioFilter and Neural Filter will do the Template Matching.


8.3   Training

Training teaches the ImageFinder what to look for. Each filter is trained differently.

If you have done the BioFilter training in the last chapter, all you have to do is to click �NeuralFilter\Train (match.txt required)� to train the Neural Filter.

For both the BioFilter and the Neural Filter, training requires two files, a1.txt and match.txt:


These two files for training are fixed; you cannot change the names of these two files. You obtain a1.txt automatically, as you convert images to records. You have to prepare match.txt for each problem.

The match.txt looks like this:

152
1    L01008gi_r90           L01008gi-020501
2    L01008KEY_m        L01008key-082301_m
3    L010103C                L010103C-081502_m
4    L01010co_m            L01010CODE_m
5    L010163C_m           L010163C-083100_m

Line 1 is the number of matches in this file. This match file indicates image, L01008gi_r90, will match with image, L01008gi-020501. Each line has the following format:

Number, tab, filename1, tab, filename1, tab.

Note:
(1) You must have a tab at the end of each line;
(2) The file names do not contain �.jpg�.

There are two common errors:

(1) The last Tab is missing;
(2) The number of rows is less than the first number in the file.

Once you get the two files prepared, click �NeuralFilter\Train (match.txt required)� to train the Neural Filter.

To continue the Label Recognition example, we must prepare the match.txt file now. This file is already prepared for you and we will simply open it and save it to match.txt. The steps are:

Now the ImageFinder is trained for the Label Recognition problem. We will continue this example in the next section, N:N Matching.

8.4   N:N Matching

N: N Matching compares each image specified in the search-directory or search-file, with every image in the search-directory or search-file. N: N Matching is further divided into N: N Matching and N : (N-1)/2 Matching.

Let a and b be two images; N:N Matching is {aa, ab, ba, bb} and the N : (N-1) Matching is {ab}. The N: N Matching has N * N comparisons; and the N : (N-1)/2 Matching has N * (N-1 )/2 comparisons.  The purpose of N : (N-1)/2 Matching is to reduce the number of comparisons.

Now, go back to our example. Click �NeuralFilter/NeuralFilter N:N Match�; we get b1.txt, which will be opened right after the click:

C:\�\L01008gi_r90.jpg
C:\�\L01008gi_r90.jpg
C:\�\L01008gi-020501.jpg

C:\�\L01008KEY_m.jpg
C:\�\L01008KEY_m.jpg
C:\�\L01008key-082301_m.jpg


Again, line 1 in each block is input and the rest of the lines are output. Go all the way to the end of the file; the last line is:

Total Number of Matches = 608


There are 152 pairs or 304 images. Each image will match itself and its partner in the pair, giving a total of 608 matches. As we will see next, all of the 608 matches and only the 608 matches are identified.

Click �NeuralFilter/NeuralFilter N:(N-1) Match�; we get b1.txt, which will be opened right after the click; the last line is:

Total Number of Matches = 152


There are 152 pairs or 304 images. The first image in a pair will match its partner in the pair, but the second image in a pair will not match the first one, giving a total of 152 matches. As we will see next, all of the 152 matches and only the 152 matches are identified.

8.5   1:N Matching

Again, 1:N Matching compares one key image with the images in a search-directory or search-file; the key image is specified in the �Key Segment� textbox or selected by the �Key Segment� button.

To continue the Label Recognition problem for Supervised 1:N Matching:

The result is in file, b1.txt, which will be opened at this point:
C:\�\biofilterex1\L01008gi-020501.jpg
C:\�\L01008gi_r90.jpg
C:\�\L01008gi-020501.jpg

Total Number of Matches = 2

This result is correct. We will continue this example in the next section, Checking.

8.6   Checking the Results

If this is a test run (i.e. you know the correct answers) you can see the matching results in seconds. You must prepare a file, which indicates the matching pairs. To test the results in b1.txt, you must prepare B1_matchlist.txt file.

If you have done the BioFilter checking in the last chapter, all you have to do is to click NeuralFilter/Check (b1_matchlist.txt required)� to check the result of the NeuralFilter.

An example of b1_matchlist.txt is:

152
1    L01008gi_r90           L01008gi-020501
2    L01008KEY_m        L01008key-082301_m
3    L010103C                L010103C-081502_m
4    L01010co_m            L01010CODE_m
5    L010163C_m           L010163C-083100_m

Line 1 is the number of matches in this file. The format is exactly the same as match.txt.

Number, tab, filename, tab, filename, tab.

Note:
You must have a tab at the end of each line;
The file names do not contain �.jpg�.

There are two common errors:

To continue the Label Recognition example, we must prepare the b1_matchlist.txt file now. This file is already prepared for you and we will simply open it and save it to b1_matchlist.txt (overwrite the existing file). The steps are: You will see the something like following in the text window:
Checking Template Matching Results!
Get b1.txt...
Character = 68045
Lines = 1459
Blocks = 305
Get b1_matchlist.txt...

Check...
Total Matches = 608

The message indicates b1.txt has 305 blocks: the 304 image blocks plus the last line indicating the total number of matches retrieved. The message �Total Matches = 608� indicates that 608 matches in b1.txt agrees with those in b1_matchlist.txt.

8.7   Analysis
 

Possible Matches

Let the Total Images in the input file be N; the Possible Matches will be N*N. In our example, N * N = 304 * 304 = 92,416.
Attrasoft Matches
The number of retrieved matches is listed in the last line of b1.txt. Go to the end of b1.txt, you will see something like this:
Total Number of Matches = 608
Actual Matches
This number depends on your problem. It should be the first number in b1_matchlist.txt. In our example, it is 608.
Attrasoft Found Duplicates
Click the �Results� button to get this number, as discussed in the last section.


Now that you have all of the numbers, you can make an analysis.

Positive Identification Rate

Positive Identification Rate = the result of clicking �Neural Filter/Check (b1_matchlist.txt required)� menu item divided by the first number in file, b1_matchlist.txt. In our example, 608/608 = 100%, i.e. the

Positive Identification  Rate is 100%.

Elimination Rate
The Elimination Rate is 1 minus the number at the end of b1.txt divided by the number of possible matches. In our example, this number is:

Elimination Rate  = 100%
= (1 � 608/92,416 ) / (1 � 608/92,416 ).

Hit Ratio
The Hit Ratio is the number indicated by �Neural Filter/Check (b1_matchlist.txt required)� menu item divided by the number at the end of b1.txt. In our example, this number is: 608/608 = 100%.

Hit Ratio = 100%.

Composite Index
Finally, an identification is measured by multiplication of Positive Identification Rate * Elimination Rate * Hit Ratio. In our example, this number is 100% * 100% * 100% = 100 %.

Composite Index = 100%.


8.8   Summary

Summary of the steps for N:N Matching:

I.  Preparations

Data

Data is stored at �.\biofilterex1�. There are 304 images.
Training File
1. Training file must have the name �match.txt�;
2. Find the file, �.\biofilterex1_match.txt�, and save it to match.txt.
Checking File
1. Checking file must have the name �b1_matchlist.txt�;
2. Find the file, �.\biofilterex1_matchlist.txt�, and save it to b1_matchlist.txt.


II. Operation

Return