Software testing is basically an exercise in continuous exploration, discovery and questioning. This exercise becomes very interesting and challenging when the application under test is as complex as Maps. You have probably used applications like Google Maps or Yahoo Maps; their main use is to help users find a route. As input, the user gives a source and a destination, and based on this information Maps gives directions to get from the origin to the destination. From this description you might think the application is simple, but testing it has many challenges.
As a tester, you need to find relevant queries and assess the quality of the results the system produces for them.
During beta testing of the application, we collected thousands of queries and input data used by end users. To give an idea of how much data we had, for each city there were over 8,000 queries, for example Mumbai Hotels, Mumbai Escorts, Mumbai Taj, etc. Finding the relevant data among these queries is very difficult and time consuming.
This data can be analyzed for appropriate queries in two different ways: either people are assigned to analyze it, or we use artificial intelligence and write some clever tool. Given that human resources are too expensive to get :) we decided to develop a tool to classify the input data.
After looking at the various possible solutions, we decided to use a Bayesian classifier. For people who are interested in learning more about Bayesian classifiers, this is what Wikipedia says about Bayes' theorem.
Bayes' theorem (also known as Bayes' rule or Bayes' law) is a result in probability theory that relates the conditional and marginal probability distributions of random variables. In some interpretations of probability, Bayes' theorem tells how to update or revise beliefs in light of new evidence a posteriori.
The probability of event A conditional on another event B is generally different from the probability of B conditional on A. However, there is a definite relationship between the two, and Bayes' theorem is a statement of that relationship.
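In symbols, for two events A and B with P(B) > 0, that relationship is usually written as follows (standard notation, not taken from the quoted text):

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```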
Classifiers based on Bayes' theorem are well known from spam email filtering. Generally, spam filters have a large dataset of good and spam email. They work on the probability that certain words appear in spam messages rather than in normal mail. Spam filtering systems also learn from their users every time a user clicks the Report spam or Not spam button.
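As a rough illustration of the word-probability idea, the sketch below applies Bayes' theorem to a single word. The counts are invented for the example and the function name `spam_probability` is hypothetical; real spam filters combine many words and are considerably more involved.

```ruby
# Hypothetical counts for illustration only (not data from this article).
spam_total     = 400  # spam messages in the training set
ham_total      = 600  # normal messages in the training set
spam_with_word = 120  # spam messages that contain the word
ham_with_word  = 6    # normal messages that contain the word

# Bayes' theorem: P(spam | word) = P(word | spam) * P(spam) / P(word)
def spam_probability(spam_with_word, spam_total, ham_with_word, ham_total)
  total = (spam_total + ham_total).to_f
  p_spam = spam_total / total
  p_ham = ham_total / total
  p_word_given_spam = spam_with_word / spam_total.to_f
  p_word_given_ham = ham_with_word / ham_total.to_f
  # Total probability of seeing the word in any message
  p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham
  p_word_given_spam * p_spam / p_word
end

result = spam_probability(spam_with_word, spam_total, ham_with_word, ham_total)
puts format('P(spam | word) = %.3f', result)
```

With these made-up counts, a message containing the word is much more likely to be spam than the 40% base rate alone would suggest.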
So we wrote our own tool based on Bayes' theorem, with the ability to learn what a good query is and what a bad query is. The tool learns to classify data based on how you train it. In simple terms, the input to the tool is the definition of what is good, the definition of what is bad, and the sample data. Based on this, it sorts the data into good or bad, simple as that.
Normally, to classify a set of text, we must first teach the tool what is good and what is bad. During training, the classifier keeps track of how often each word occurs in each category.
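To make the training step concrete, here is a minimal naive Bayes sketch in plain Ruby. The class `TinyBayes` and its methods are hypothetical names written for this illustration and are not the API of any library; the good and bad labels given to the sample queries are assumptions as well.

```ruby
# A tiny naive Bayes text classifier, written only to illustrate the idea.
class TinyBayes
  def initialize(*categories)
    # @word_counts[category][word] = how often the word was seen in that category
    @word_counts = categories.each_with_object({}) { |c, h| h[c] = Hash.new(0) }
    @doc_counts = Hash.new(0)
  end

  # Training: count every word of the text under the given category.
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Classification: pick the category with the highest log probability.
  def classify(text)
    words = tokenize(text)
    vocabulary = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @doc_counts.values.sum.to_f
    scores = @word_counts.map do |category, counts|
      total_words = counts.values.sum
      # Log prior plus log likelihoods, with add-one smoothing for unseen words
      score = Math.log((@doc_counts[category] + 1) / (total_docs + @word_counts.size))
      words.each do |word|
        score += Math.log((counts[word] + 1.0) / (total_words + vocabulary + 1))
      end
      [category, score]
    end
    scores.max_by { |_, score| score }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

bayes = TinyBayes.new('good', 'not good')
bayes.train('good', 'Mumbai hotels')
bayes.train('good', 'Mumbai Taj')
bayes.train('not good', 'Mumbai escorts')
puts bayes.classify('hotels near Taj')
```

Logarithms are used so that multiplying many small probabilities does not underflow; the smoothing keeps a word that was never seen in a category from driving that category's score to minus infinity.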
Implementation
This tool was written in Ruby, because Lucas Carlson's classifier library is available as a gem. This library features a naive Bayesian classifier. More information about it can be found here.
In our application, the following code reads three files:
* good.yml
* not_good.yml
* the input file
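The layout of the two training files is not shown in this article. Since the script iterates over each loaded file and trains on every entry, one plausible layout is a plain YAML list of example queries. The sketch below assumes that layout, and the entries are made up for illustration:

```ruby
require 'yaml'

# Hypothetical contents of good.yml: a plain YAML list of example queries.
good_yaml = <<~YAML
  - Mumbai hotels
  - Mumbai Taj
YAML

good = YAML.load(good_yaml) # an Array of Strings under this assumed layout
good.each { |query| puts "would be trained as good: #{query}" }
```

not_good.yml would follow the same shape under this assumption, with queries that should be treated as bad.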
To run the tool, we must give two arguments on the command line: the city name and the input file name. Based on the definitions of good and bad, it creates a directory named after the city and puts good.txt and nogood.txt in that directory, containing the input lines classified as good or bad.
require 'stemmer'
require 'classifier'
require 'yaml'

if ARGV.empty?
  puts "*** You must provide the city name and the input file name to the script ***\n"
elsif ARGV[1]
  puts "I am looking for the city #{ARGV[0]}\n"
  puts "The input file is #{ARGV[1]}\n"
  input_file = ARGV[1].to_s.downcase
  pwd = Dir.getwd
  city = ARGV[0].to_s.downcase
  # Create the output directory for this city unless it already exists
  Dir.mkdir(city) unless Dir.exist?(city)
  # Load the training definitions
  good = YAML.load_file('good.yml')
  not_good = YAML.load_file('not_good.yml')
  data = File.open(input_file, "r")
  goody = File.open(File.join(pwd, city, "good.txt"), "a")
  nogood = File.open(File.join(pwd, city, "nogood.txt"), "a")
  classifier = Classifier::Bayes.new('good', 'not good')
  # Train the classifier
  not_good.each { |not_good_one| classifier.train_not_good not_good_one }
  good.each { |good_one| classifier.train_good good_one }
  # Write each input line to the file that matches its predicted category
  while line3 = data.gets
    if classifier.classify(line3) == "Good"
      goody.write line3
    else
      nogood.write line3
    end
  end
  [data, goody, nogood].each(&:close)
else
  puts "*** The second argument, the input file name, is required ***\n"
end
Quality of results
The quality of the result depends on the amount of training we have given to the classifier. It is a kind of learning system, so the better the training, the better the classification. The main advantage of this approach is the reduction in the human effort needed to sort the data. There are many other applications where human intervention is needed to decide what is good and what is bad, and a properly trained classifier can be useful in those situations too.
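One simple way to put a number on that quality is to classify a small set of hand-labelled queries and count how many the classifier gets right. The sketch below assumes such a labelled set exists; the `accuracy` helper, the stand-in classifier, and the samples are all hypothetical and are not part of the tool described above.

```ruby
# Fraction of labelled samples that the classifier gets right.
# `classify` is any callable that maps a text to a category name.
def accuracy(labelled_samples, classify)
  return 0.0 if labelled_samples.empty?
  correct = labelled_samples.count { |text, label| classify.call(text) == label }
  correct / labelled_samples.size.to_f
end

# Stand-in classifier for demonstration only: it flags one hard-coded word.
toy_classifier = ->(text) { text.downcase.include?('escorts') ? 'Not good' : 'Good' }

# Hypothetical hand-labelled samples.
samples = [
  ['Mumbai hotels', 'Good'],
  ['Mumbai escorts', 'Not good'],
  ['Mumbai Taj', 'Good'],
  ['asdf qwerty', 'Not good'] # the toy rule gets this one wrong
]
puts accuracy(samples, toy_classifier)
```

Running the same check after adding more training data shows whether the extra training actually improved the results.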
We hope you found this article interesting and that you will be able to use this approach when you need to classify data for your own application.