Attrasoft ImageFinder

18.   BIOFILTER II
18.1   Input
18.2   Data
18.2.1   Good Match
18.2.2   Bad Match
18.3   Templates
18.3.1   File Input
18.3.2   Directory Input
18.4   Training
18.5   Training the BioFilter II
18.6   1:N and N:N Matching Example 1
18.7   1:N and N:N Matching Example 2
18.8   Two-Layer Neural Net Architecture

18.   BioFilter II
Until now, we have been using one-layer Neural Network Architecture:
Image Preprocessing
Edge Filters;
Threshold Filters; and
Clean Up Filters.
Normalization
Reduction Filter.
Feature Recognition
BioFilter;
NeuralFilter.
Pixel Level Recognition
ABM Filter.
We call this Basic Architecture. When you go beyond the Basic Architecture, the options are limitless.
We will focus our discussion on a specific application: to compare two basically identical images and identify the minor differences. For example, in two satellite images, there is a car in one image, but not in the other. Potential applications are:
Satellite Image Recognition
Quality Control
Product Label Recognition
…

Our approach is to divide the image into small sections and use our Basic Architecture for each smaller section of the image. This will generate the following additional layers:
BioFilter II;
NeuralFilter II;
ABM Filter II.
In this multi-layered Neural Network Architecture, Layer 1 (Basic Architecture) serves as a filter for later layers (i.e., to decide quickly whether an image can pass or not). It simply decides whether two images are similar with potentially minor differences, or whether these are two completely different images.
Throughout this chapter, we will focus on a Label Verification problem.
18.1   Input
An input file or an input directory enters data into the BioFilter II. An input file is a text file, which looks like this:
C:\abc1\efg1\image0001.jpg
C:\abc2\efg2\image0002jpg
…
This file lists one image per line. Please do not add a blank space to the end of the line. The first line in the file is the newly captured image, which will be matched against the previously stored images specified by the rest of the line. The number of the previously stored images is arbitrary; for example, in this input file:
C:\newlycaptured\L12063.jpg
C:\masterdata1\L12063.jpg
C:\masterdata2\L12063.jpg
C:\masterdata3\L12063.jpg
The newly captured image, C:\newlycaptured\L12063.jpg, will be matched against three previously stored images.
Multiple matches cannot be entered in a single input file; each match must have its own file. In the case of directory input, the first image will be considered as the “newly captured image” and the rest of the images will be considered “previously stored images”. Because it is hard to control the image scanning order, the file input is preferred over the directory input.
18.2   Data
The Label Verification problem is to make sure the new product images match the existing images. The requirements are to compare a newly captured image against previously stored images and determine:

Match or No Match?

If not, where is the error?

This section presents two pairs of images used in this chapter. The first pair of images match, and the second pair does not match.
18.2.1   Good Match
Good Match verifies that the newly captured image matches the existing images. Figure 18.1 shows a Good Match.

Figure 18.1 Good Match.
18.2.2   Bad Match
Bad Match rejects the newly captured image because it does not match the existing images. The following figure is an example:

.

Figure 18.2 Bad Match.
The errors are:
Error       Difference                  Coordinate
Number
1
0g
vs.
4%                                  21

2
Canned vegetable and
Fruits for good nutrition
Try a variety of key food's
Vs.
Fruits for good nutrition
Try a variety of key food's
Canned vegetable and        22, 32

3
Extra word "Cranberry"          13
The images are 200 pixels per inch. The coordinate 21 means this: starting from upper left corner, go 2 inches to the right, 1 inch down and look at the 1 inch square box.
18.3   Templates

Figure 18.3   BioFilter2 Menu.
BioFilter II is exactly the same as BioFilter, except it operates on a section of an image, instead of whole image. It will convert a section of an image into a template. In this section, we will see both File Input and Directory Input.
18.3.1   File Input
1. Click the “File Input” button and open “biofilter2_ex1_input.txt; you will see:
9961400239\9961400239 1001.jpg
9961400239\9961400239 1002.jpg
2. Click “BioFilter 2/Scan Images – File Input” to convert images into template records.
3. Open d1.txt to see the results.
18.3.2   Directory Input
1. Click the “Search Dir” button and select 9961400239\9961400239 1001.jpg.
2. Click “BioFilter 2/Scan Images – Directory Input” to convert images into template records.
3. Open d1.txt to see the results.
18.4   Training
Unlike the BioFilter, which supports unsupervised learning, the BioFilter II must be trained. Training teaches the BioFilter II what to look for.
Each filter is trained differently. For the BioFilter II, training requires two files, d1.txt and match2.txt:

D1.txt is the record file, which contains many records. Each image is converted into multiple records. A record represents features of an image segment in a feature space.

Match2.txt is a list of matching pairs. This file will teach the ImageFinder who will match with whom.

These two files for training are fixed; you cannot change the names of these two files. You obtain d1.txt automatically, as you saw in the last section of this chapter. You have to prepare match2.txt for each problem.
The match2.txt looks like this:
1591
1   9961400239 1001_00   9961400239 1002_00
2   9961400239 1001_01   9961400239 1002_01
3   9961400239 1001_02   9961400239 1002_02
4   9961400239 1001_03   9961400239 1002_03
5   9961400239 1001_04   9961400239 1002_04
…
Line 1 is the number of matches in this file. This match file indicates images 9961400239 1001, section 00, will match with image 9961400239 1002, section 00. Here section (0,0) is the upper left corner. Each line has the following format:
Number, tab, filename1, tab, filename1, tab.
Note:
(1) You must have a tab at the end of each line;
(2) The file names do not contain “.jpg”.
There are two common errors:
(1) The last Tab is missing;
(2) The number of rows is less than the first number in the file.
Once you get the two files prepared, click “BioFilter 2\Train (match2.txt required)” to train the BioFilter II. After training, you can use these commands:
“BioFilter 2/1:N Match (First vs. Rest)”
“BioFilter 2/N:N Match”
The 1:N or N:N Match is based on the d1.txt file. In 1:N Matching, the first image in d1.txt will match the rest of the images. In N:N Matching, each image in d1.txt will match against the rest of the images.
18.5   Training the BioFilter II
To continue the Label Recognition example, we must prepare d1.txt and match2.txt file for training now. The data used for training consists of 50 matching pairs. The data is in “ \Biofilter2_good”.
You must prepare d1.txt in piecemeal for 2 reasons:
(1) The BioFilter II will use the first image in the input file or a directory to determine the region of interest. Because the images are different in size, you have to convert each image separately.
(2) It will take a long time to convert images into BioFilter II templates if many images are used, because the BioFilter II keeps all images in RAM.
All examples in this chapter use the same setting. You can click “Example/BioFilter 2/Label Match Setting” or any example under “Example/BioFilter 2” to open the parameters and then, click “Batch/Load” command to load. If you want to work manually, here are the parameters:
Edge Filters: Sobel 1
Threshold Filters: Dark Background 128
Clean Up Filter: Small
BioFilter: CL
BioFilter II: CL1
For example, to generate d1.txt for images in folder “.\9961400239\”,

Click the “Search Dir” button and select \9961400239.

Click “BioFilter 2/Scan Images – Directory Input” to convert images into template records.

We have prepared the d1.txt for you. Open biofilter2_d1.txt and save it to d1.txt.
The second training file is match2.txt, which specifies who should match with whom. This file is already prepared for you. Go to the ImageFinder folder. (The default folder is C:\program files\Attrasoft\ImageFinder 6.0\”and open the file, biofilter2_match2.txt. Save it to match2.txt (overwrite the existing file). Now the training file is prepared.
To train the BioFilter II, click “BioFilter 2/Train (match2.txt required)”. Now the BioFilter II is trained for the label recognition problem. You can save the trained results by clicking “Batch/Save” to save the trained BioFilter II.

18.6   1:N and N:N Matching Example 1
1:N Matching compares the first image in the template file, d1.txt, with the rest of the images in the file; the d1.txt, in turn, is generated by the input file or input directory. The images used in this example can be seen in Figure 18.1.
The data used in this example will be in biofilter2_ex1_input.txt. Operations for 1:N Matching:

Click “Example/BioFilter 2/Label Match Setting” to open the parameters;

Click “Batch/Load” command to load parameters. At this point, the BioFilter II is also trained.

Click the “File Input” button and open “biofilter2_ex1_input.txt; you will see:

9961400239\9961400239 1001.jpg
9961400239\9961400239 1002.jpg

Click “BioFilter 2/Scan Images – File Input” to convert images into template records.

Click “BioFilter 2/1:N Match (First vs. Rest)” to make a 1:N Match.

The results look like this:
C:\…\9961400239\9961400239 1001
0 0
0 1
Match!
0 2
Match!
…
3 9
Match!
Image Match
The BioFilter II divides the images into smaller sections. In this example, the sections are 00, 01, … 39. Here section (0,0) is the upper left corner. The output shows the matching result in each section. The above example shows the “newly captured image”, 9961400239\9961400239 1001.jpg, matches the “previously stored image”, 9961400239\9961400239 1002.jpg, in all sections. We now continue the example:

Click “BioFilter 2/N:N Match” to make a N:N match.

The results are:
C:\…\9961400239 1001
0 0
Match!
0 1
Match!
0 2
Match!
…
3 9
Match!
Image Match!

C:\…\9961400239 1002
0 0
Match!
0 1
Match!
0 2
Match!
…
3 9
Match!
Image Match!

These results show 9961400239 1001 matches 9961400239 1002 and 9961400239 1002 matches 9961400239 1001.
18.7 1:N and N:N Matching Example 2
In this section, we will present an example, which does not match. The input file is
biofilter2_ex2_input.txt.
The operations are:

Click “Example/BioFilter 2/Label Match Setting” to open the parameters;

Click “Batch/Load” command to load parameters. At this point, the BioFilter II is also trained.

Click the “File Input” button and open “biofilter2_ex2_input.txt; you will see:

biofilter2_good10/l01016key.jpg
biofilter2_bad10/l01016key.jpg

Click “BioFilter 2/Scan Images – File Input” to convert images into template records.

Click “BioFilter 2/1:N Match (First vs. Rest)” to make a 1:N Match.

The results are:
C:\…\biofilter2_good10/l01016key
0 0
Match!
…
1 3
No Match!
…
3 3
No Match!
…
6 0
No Match!
…
Image Does Not Match!

18.8   Two-Layer Neural Net Architecture
In the last two sections, we present a 'Matching' example and a 'No Match' example, using the BioFilter II alone. The examples in section use the two-layered Neural Network Architecture:
BioFilter
BioFilter II
The objective is to compare two basically identical images and identify the minor differences. Two different images do not fall into this category. The BioFilter will serve as a filter to get rid off different images and the BioFilter II will identify the minor differences.
The three examples used in this section are 1:N Match, N:N Match in section 6, and 1:N Match in section 7. This process will take many steps; the batch job is perfect for this type of work.
The operation for the first example is:
Click “Example/BioFilter 2/Label example 1 (1:N)”;
Click “Batch/Run”.
The operation for the second example is:
Click “Example/BioFilter 2/Label example 2 (N:N)”;
Click “Batch/Run”.
The operation for the third example is:
Click “Example/BioFilter 2/Label example 3 (1:N)”;
Click “Batch/Run”.
The results of these runs should match results earlier.

Return