Book Description
As with any burgeoning technology that enjoys commercial attention, the use of data mining is surrounded by a great deal of hype. Exaggerated reports tell of secrets that can be uncovered by setting algorithms loose on oceans of data. But there is no magic in machine learning, no hidden power, no alchemy. Instead there is an identifiable body of practical techniques that can extract useful information from raw data. This book describes these techniques and shows how they work.
The book is a major revision of the first edition that appeared in 1999. While the basic core remains the same, it has been updated to reflect the changes that have taken place over five years, and now has nearly double the references. The highlights for the new edition include thirty new technique sections; an enhanced Weka machine learning workbench, which now features an interactive interface; comprehensive information on neural networks; a new section on Bayesian networks; plus much more.
* Algorithmic methods at the heart of successful data mining, including tried and true techniques as well as leading edge methods
* Performance improvement techniques that work by transforming the input or output
* Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualization, in a new, interactive interface
Download Description
Like the popular first edition, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. Inside, you'll learn all you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining, including both tried-and-true techniques of the past and Java-based methods at the leading edge of contemporary research. If you're involved at any level in the work of extracting usable knowledge from large collections of data, this clearly written and effectively illustrated book will prove an invaluable resource. Complementing the authors' instruction, including a fully-revised Chapter 8 and 30 new technique sections, is a fully functional platform-independent Java software s
Customer Reviews:
Customer Satisfaction.......2007-09-07
The book I got was in very good condition, and the delivery time was also good.
A great introduction to data mining and machine learning.......2007-07-24
I bought this book in the hope that it would help me better explore the data from the Netflix Prize contest, which it did. I had been reading numerous Wikipedia articles, scientific papers, etc. online and felt it would be useful to have a more general tome on the subject. This book covers many of the common, overarching themes (e.g., clustering, neural networks, linear regression) to varying degrees. I only wish the examples involved slightly more complex data sets and that more pseudo code was provided. I suppose since the book is very closely tied to WEKA, one could always dig through the source code of that application; but I feel that the authors could have provided a bit more of the strictly algorithm-relevant code in the book.
Incredibly Useful for theory and Practice.......2007-05-15
This book is by the guys who created WEKA. It covers a wide swath of machine learning without losing a diligent reader. Then it walks the reader through using WEKA, a fantastically powerful open source tool.
Amazing.
wrong named.......2007-04-20
If "practical" means having a sample project alongside the theory, this book is incorrectly named. With such a big bunch of text and description, it is not "practical"!
Incredibly practical introduction.......2006-10-30
This book is perfect if you are trying to get your hands around what data mining and machine learning are. Most of the books I have read on this subject want to start with equations and get more complex from there, with little practicality. This book makes extensive use of examples and introduces the mathematical basis for algorithms where needed. The authors make the point that simpler algorithms often work best for solving machine learning problems. Similarly, I would argue, simpler books work best for understanding highly complex fields. I very highly recommend this book.
Book Description
During the past decade there has been an explosion in computation and information technology. With it has come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It should be a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting--the first comprehensive treatment of this topic in any book. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie wrote much of the statistical modeling software in S-PLUS and invented principal curves and surfaces. Tibshirani proposed the Lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, and projection pursuit.
Customer Reviews:
Great statistics book........2007-09-24
I'm a machine learning person, and this book provides a pretty thorough, state-of-the-art, and relatively up-to-date summary of the statistical methods used in many pattern classification fields. One thing missing from the book is generative models, although it is the best book of its kind for describing discriminative models.
Most Useful Machine Learning Book.......2007-09-24
This book describes most of the important topics in machine learning. Most machine learning books just present a criterion and an optimization algorithm. For instance, LDA is often presented as: here is the Fisher criterion, it seems like a good thing to maximize. "The Elements of Statistical Learning" also shows that this is the right criterion if the distributions of the data for each class are Gaussian with the same covariance. This book puts all the algorithms in the same statistical language, which makes them easy to compare and choose between.
I also appreciate the emphasis this book puts on algorithms that are more recently popular/effective. I very much appreciate the discussions of logistic regression vs. LDA, ridge and lasso regression, boosting/additive logistic regression and additive trees, decision and regression trees, ...
The only qualm I have with this book is that it is rather biased toward the authors' own research. It is difficult from reading this book alone to differentiate between classical techniques and the authors' recent proposed algorithms.
Best data mining book.......2007-09-21
If you are looking for a relatively rigorous but very readable data mining book, this is simply the best! It covers most of the modern techniques and is beautifully printed with high quality graphics.
A very introductory book and well-written.......2007-02-05
The discussed book is very explanatory and could be students' material for academic lessons.
A must book for every statistician and data miner.......2007-01-18
This book has become a classic for any statistician and data miner by now.
It is a broad overview of regression, classification and clustering techniques (supervised and unsupervised machine learning).
Book Description
Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.
Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and applications. This new edition substantially enhances the first edition, and new chapters have been added to address recent developments on mining complex types of data including stream data, sequence data, graph structured data, social network data, and multi-relational data.
Whether you are a seasoned professional or a new student of data mining, this book has much to offer you:
* A comprehensive, practical look at the concepts and techniques you need to know to get the most out of real business data.
* Updates that incorporate input from readers, changes in the field, and more material on statistics and machine learning.
* Dozens of algorithms and implementation examples, all in easily understood pseudo-code and suitable for use in real-world, large-scale data mining projects.
* Complete classroom support for instructors at www.mkp.com/datamining2e companion site.
Customer Reviews:
Good high-level review with little mathematics........2006-12-08
This is a great textbook for an undergraduate or layperson to the information sciences, but specialists may find it lacking depth. It is very good at identifying practices and principles that would guide a high-level planner toward a sound research program. That said, this book exhaustively covers the breadth of the modern field at the expense of formulas, algorithms, and source code that would have been valuable to an engineer or scientist with plans to implement.
* Buy this book if you require a high-level understanding of the concepts and techniques used in the field.
* Don't buy this book if you are planning to specialize in data mining, or if you have plans to implement yourself.
Significant improvements since first edition.......2006-09-20
I read the first edition of this book years ago. This second edition has significant improvements. The core topics (classification, clustering, association rules) are very detailed and much easier to read. The authors also add much material about advanced topics such as graph mining, multimedia mining, stream and time series mining, etc. Although these advanced topics are not as well written as the core topics, at least you will get an idea of what's going on in these areas.
1 star is too much for this book.......2006-02-15
Do yourself a favor and stay away from this book. The book is put together by gathering some data mining concepts padded with tons of buzzwords to "teach" you about data mining. It really fails to teach you anything and it really succeeds in confusing the hell out of the reader. The language used in this book is the poorest I have ever seen in my entire life. The people who wrote this book and the publishers who published it should be ashamed of themselves. It is a shame that I had to pay for this piece of work because I have to read it for a course I am taking and I really dread opening it every week.
One of the Best Data Mining Books.......2005-10-30
This book is easily one of the best on data mining. The one flaw I see is the meager attention given to neural networks. Coverage is practical, not theoretical.
Best introduction I know.......2004-11-15
It is very easy to collect huge volumes of data - social statistics, bank records, biological data, and more - but very hard to pull useful facts out of the heap. This book is about processing large volumes of data in ways that let simple descriptions emerge.
This is an introductory level book, aimed at someone with reasonably good programming skills. A little facility with statistics might help, but certainly isn't necessary. The book starts gently, with some very basic questions: what is data mining exactly, when there seem to be so many definitions for the term? What is a data warehouse, and how does it differ from a database? Next, the authors address the data itself in terms of quality, usability, and organization for efficient access. The central chapters, 4 through 8, address various kinds of query specification, kinds of relationships to extract, correlations, clustering, and classification. None of the discussions is especially deep. All, however, are presented in pseudocode or simple math that can easily be translated into working code. The careful reader learns a few basic principles that work well in many contexts: entropy maximization, Bayesian analysis, and simple stats. It may be surprising to see how little of normal statistical analysis is used. I suspect the authors assume that stats-savvy readers will already know how to apply significance testing, and that stats-naive readers don't need the distraction. The last chapters discuss complex data, where the best structure for the data and the questions to be asked of it are not at all obvious, and tools and applications used in data mining.
The book is nicely laid out as a textbook, with an orderly summary, problem set, and bibliography at the end of each chapter. The bibliography is more than just a list of names and authors - it actually helps the reader decide which references will give the best description of each of the chapter's topics.
This is a clear, usable introduction to data mining: the data it uses, the questions it answers, and the techniques for connecting them. It gives codable detail for lots of techniques, and prepares the reader for more advanced discussions. I recommend it very highly.
//wiredweird
Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data
Bruce Ratner
Manufacturer: Chapman & Hall/CRC
ProductGroup: Book
Binding: Hardcover
Similar Items:
- Optimal Database Marketing: Strategy, Development, and Data Mining
- Data Mining Cookbook: Modeling Data for Marketing, Risk and Customer Relationship Management
- Strategic Database Marketing
- Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management
- Applied Data Mining: Statistical Methods for Business and Industry (Statistics in Practice)
ASIN: 1574443445
Book Description
Traditional statistical methods are limited in their ability to meet the modern challenge of mining large amounts of data. Data miners, analysts, and statisticians are searching for innovative new data mining techniques with greater predictive power, an attribute critical for reliable models and analyses. Statistical Modeling and Analysis for Database Marketing: Effective Techniques for Mining Big Data delivers a collection of successful database marketing methodologies for big data. This compendium solves common database marketing problems by applying new hybrid modeling techniques that combine traditional statistical and new machine learning methods. The book delivers a thorough analysis of these cutting-edge techniques, which include non-statistical machine learning and genetic intelligent hybrid models. By following the step-by-step procedures detailed in the text, database marketing professionals can learn how to apply the proper statistical techniques to any database marketing challenge. The practical case studies and examples provided involve real problems and real data, and are taken from a variety of industries, including banking, insurance, finance, retail, and telecommunications.
Customer Reviews:
An essential book for statistical analysts building predictive models for database marketing.......2006-01-05
This is a must-have introductory book for the practitioner using data mining to build predictive models in industry. While it does have a few snippets of SAS code, it is a conceptual book that explains the "why" and the "how" of practical model building. (If you want SAS code, buy "The Data Mining Cookbook" by Olivia Parr Rud.) It dispenses with the antiquated notion of the "true" model of classical statistics and econometrics, and shows how to arrive at an acceptable model that yields good predictions. As practitioners, this is what we care about most. Among other things, it gives good explanations of: (1) the EDA paradigm versus classical statistics; (2) Tukey's bulging rule for transforming variables; (3) variable selection, including some of the weaknesses of automatic variable selection methods, though there is no mention of clustering to eliminate redundant variables; (4) smoothed scatterplots and logit plots; (5) decile analysis and using bootstrapping to derive confidence intervals for cum lift.
The book shows you how to use logistic regression, OLS, and CHAID to build predictive models. For those interested in Genetic modeling, it has a clearly written chapter on the subject that explains how genetic modeling can be used to create new variables that can have more information than either of the original variables.
While this book does not cover everything, and is definitely not the last word on the subject, it is a solid first word. In particular, the book does not cover splines, shrinkage techniques such as model averaging, ridge regression, etc. For treatments of these and similar advanced topics see Frank Harrell's "Regression Modeling Strategies" and Hastie, Tibshirani and Friedman's "Elements of Statistical Learning".
Data Mining for Database marketing.......2003-06-10
I predict that Dr. Ratner's Statistical Modeling and Analysis for Database Marketers: Effective Techniques for Mining Big Data will be on every database marketer's bookshelf. Dr. Ratner has put together an assembly of chapters that provide an indispensable resource for the daily problems facing data analysts and model builders in the database/direct marketing community. In each of the seventeen chapters, Dr. Ratner addresses a typical problem and discusses the common solution. He points out unknown working assumptions or weaknesses of the latter, and then offers better solutions, which require basic knowledge of EDA/data mining. Dr. Ratner's writing style is unique as he makes familiar concepts new, and new concepts familiar. Thus, the book is easy and enjoyable reading. I especially like the chapters that blend statistics with machine learning, such as the introduction of the GenIQ Model.
"EDA III" for Database Marketing.......2003-06-10
I consider myself fortunate to be the first to review this book. The title aptly indicates what the book is about: Statistical Modeling and Analysis for Database Marketers: Effective Techniques for Mining Big Data. The author provides in a Tukey-esque manner a collection of solutions to common problems facing database analysts, model builders, and marketers. The book can uniquely serve as a textbook, a how-to guide, and a reference source depending on the reader's statistical training and database marketing experience. Moreover, the author actually goes where other authors provide lip service: he creates the marriage of the "old" statistical methodologies with the new machine learning influence by introducing machine learning methods specifically tailored to database assessment of optimal model performance. The book's illustrations involve real problems, real data, and better solutions. This book is a keeper!
Beginning Perl for Bioinformatics
James Tisdall
Manufacturer: O'Reilly Media, Inc.
ProductGroup: Book
Binding: Paperback
Similar Items:
- Mastering Perl for Bioinformatics
- Developing Bioinformatics Computer Skills
- Bioinformatics For Dummies (For Dummies Series)
- An Introduction to Bioinformatics Algorithms (Computational Molecular Biology)
- Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
ASIN: 0596000804
Amazon.com
Biology, it seems, is a good showcase for the talents of Perl. Newcomers to Perl who understand biological information will find James Tisdall's Beginning Perl for Bioinformatics to be an excellent compendium of examples. Teachers of Perl will likewise find the text to be filled with fresh programming illustrations of growing scientific importance. Seasoned Perlmongers who want to learn biology, however, should search elsewhere, as Tisdall's emphasis is on Perl's logic rather than Mother Nature's.
Departing from O'Reilly's earlier monograph Developing Bioinformatics Computer Skills, Tisdall's text is organized aggressively along didactic lines. Nearly all of the 13 chapters begin with twin bullet lists of Perl programming tools and the bioinformatic methods that require them. Likewise, the chapters end with exercises. String concatenation is illustrated with gene splicing, and regular expressions are taught with gene transcription and motif searching.
Tisdall emphasizes sequence examples throughout, leading up to an introduction to a Perl interface for the NIH GenBank biological database and the widely used BLAST sequence alignment tool. After a brief discussion of three-dimensional protein structure, he returns to sequence extraction and secondary structure prediction.
Tisdall's goal is to boost the beginning programmer into a domain of self-learning. He imparts essential etiquette for the success of programming newbies: use the wealth of resources available, from user documentation to Web site surveys to FAQs to How-To's to news groups and finally to direct personal appeals for help from a senior colleague. A well-plugged-in bioinformatics Perl student will soon discover Bioperl, an open-source effort to bring research-grade bioinformatic tools to the Perl community. Bioperl is described briefly at the end of Tisdall's book and will reportedly be a forthcoming title of its own in the O'Reilly bioinformatics series.
Although he introduces bioinformatics as an academic discipline, Tisdall treats it as a trade throughout his book. He indicates that open questions and computational hard problems exist, but does not describe what they are or how they are being tackled. Ultimately, Tisdall presents bioinformatics as another arrow in a bench scientist's quiver, very much like HPLC, 2D-PAGE, and the various spectroscopies.
As odd as a "bioinformatics-as-tool" book may be to its research proponents, the reduction of bioinformatics to trade status both deflates and vindicates the years of research, as Tisdall's work attests. --Peter Leopold
Book Description
With its highly developed capacity to detect patterns in data, Perl has become one of the most popular languages for biological data analysis. But if you're a biologist with little or no programming experience, starting out in Perl can be a challenge. Many biologists have a difficult time learning how to apply the language to bioinformatics. The most popular Perl programming books are often too theoretical and too focused on computer science for a non-programming biologist who needs to solve very specific problems. Beginning Perl for Bioinformatics is designed to get you quickly over the Perl language barrier by approaching programming as an important new laboratory skill, revealing Perl programs and techniques that are immediately useful in the lab. Each chapter focuses on solving a particular bioinformatics problem or class of problems, starting with the simplest and increasing in complexity as the book progresses. Each chapter includes programming exercises and teaches bioinformatics by showing and modifying programs that deal with various kinds of practical biological problems. By the end of the book you'll have a solid understanding of Perl basics, a collection of programs for such tasks as parsing BLAST and GenBank, and the skills to take on more advanced bioinformatics programming. Some of the later chapters focus in greater detail on specific bioinformatics topics. This book is suitable for use as a classroom textbook, for self-study, and as a reference. The book covers:
- Programming basics and working with DNA sequences and strings
- Debugging your code
- Simulating gene mutations using random number generators
- Regular expressions and finding motifs in data
- Arrays, hashes, and relational databases
- Regular expressions and restriction maps
- Using Perl to parse PDB records, annotations in GenBank, and BLAST output
Customer Reviews:
A practical introduction to programming for biologists.......2006-12-31
Although this book was written for biologists with no previous programming experience who have decided they need to learn to program in PERL, it is also useful for programmers entering the field of bioinformatics who need to learn the language. However, you should have some background in biology or else you'll be lost as to the purpose of the examples. That's because almost all of the examples and exercises are based on real biological problems, and this book will give you a good introduction to the most common bioinformatics programming problems and the most common computer-based biological data. This book is over five years old, but it still stands alone in that what it does it does better than any other book I've run across. The follow-on to this book is "Mastering Perl for Bioinformatics", and I recommend that book for both CS and biologist types that want to get into the more advanced parts of PERL and yet stay in the realm of learning the language via real biological problems. The following is a short run down of each chapter:
1. Biology and Computer Science - Covers some key concepts in molecular biology, as well as how biology and computer science fit together.
2. Getting Started with Perl - Shows you how to get Perl running on your computer and also talks about Perl's benefits.
3. The Art of Programming - Provides an overview as to how programmers accomplish their jobs. Some of the most important practical strategies good programmers use are explained, and where to find answers to questions that arise while you are programming is carefully laid out. These ideas are made concrete by brief narrative case studies that show how programmers, given a problem, find its solution.
4. Sequences and Strings - You start writing Perl programs with DNA and proteins. The programs transcribe DNA to RNA, concatenate sequences, make the reverse complement of DNA, and read sequence data from files. This is the first chapter to conclude with exercises.
5. Motifs and Loops - Continues demonstrating the basics of the Perl language with programs that search for motifs in DNA or protein, interact with users at the keyboard, write data to files, use loops and conditional tests, use regular expressions, and operate on strings and arrays.
6. Subroutines and Bugs - Extends the basic knowledge of Perl in two main directions: subroutines, which are an important way to structure programs, and the use of the Perl debugger, which can examine in detail a running Perl program.
7. Mutations and Randomizations - Genetic mutations, fundamental to biology, are modelled as random events using the random number generator in Perl. This chapter uses random numbers to generate DNA sequence data sets, and to repeatedly mutate DNA sequence. Loops, subroutines, and lexical scoping are also discussed.
8. The Genetic Code - How to translate DNA to proteins, using the genetic code. It also covers a good bit more of the Perl programming language, such as the hash data type, sorted and unsorted arrays, binary search, relational databases, and DBM, and how to handle FASTA formatted sequence data.
9. Restriction Maps and Regular Expressions - An introduction to Perl regular expressions. The main focus of the chapter is the development of a program to calculate a restriction map for a DNA sequence.
10. GenBank - The Genetic Sequence Data Bank (GenBank) is central to modern biology and bioinformatics. In this chapter, you learn how to write programs to extract information from GenBank files and libraries. You will also make a database to create your own rapid access lookups on a GenBank library.
11. Protein Data Bank - Develops a program that can parse Protein Data Bank (PDB) files. Some interesting Perl techniques are encountered while doing so, such as finding and iterating over lots of files and controlling other bioinformatics programs from a Perl program.
12. BLAST - Develops some code to parse a BLAST output file. Also mentioned are the Bioperl project and its BLAST parser, and some additional ways to format output in Perl.
13. Further Topics - Looks at topics beyond the scope of this book. These topics include sequence alignment methods like the Smith-Waterman algorithm and microarray techniques that enable the measurement of the relative levels of thousands of gene transcripts at a time. These topics are only briefly mentioned, and you are shown places outside of the book to get further information.
Appendix A - Resources for Perl and for bioinformatics programming, such as books and Internet sites.
Appendix B - Summary of those parts of the Perl language that will be most useful as you read this book.
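To give a flavor of the sequence tasks listed for Chapter 4 (transcribing DNA to RNA and making the reverse complement), here is a minimal sketch. It is written in Python rather than the book's Perl, and the function names and the sample sequence are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: the book itself works in Perl, and these
# function names are assumptions, not taken from the book.

# Translation table mapping each base to its complement (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def transcribe(dna: str) -> str:
    """Transcribe DNA to RNA by replacing thymine (T) with uracil (U)."""
    return dna.replace("T", "U").replace("t", "u")


def reverse_complement(dna: str) -> str:
    """Complement every base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    dna = "ACGGGAGGACGGGAAAATTACTACGGCATTAGC"  # made-up sample sequence
    print(transcribe(dna))
    print(reverse_complement(dna))
```

Both operations are plain string manipulation, which is why string handling comes so early in the chapter outline.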
Reasonable book for learning Perl.......2006-11-11
For students of molecular biology and genetics, and of other bioinformatics-related departments, this book is an above-average resource for learning Perl.
Mostly for the Biologist.......2006-09-20
People come to Bioinformatics from either the bio side or the CS side, with a few from various other disciplines. This book is best for the bio person who is getting into programming, not the programmer who is getting into bio.
For you CS types, I attended a tutorial by Tisdall on this material some years ago. One of the attendees asked why you needed an editor to code in Perl. That is the level that we are dealing with here!
It is a crime that biology and biochem students are not taught any Perl; this is a very useful tool that will be more important as time goes on.
Perhaps someone could write a book on bioinformatics Perl for programmers someday, but that is not the goal of this book.
Beginning Perl for BioInformatics.......2006-01-17
Excellent book. The Perl programming language is the most versatile and powerful software tool available today. This book was written for Biologists to learn this incredible programming language. Examples are pulled from real problems Biologists face and explained in terms they can understand. The book is clearly written.
excellent introduction..........2005-12-31
I find this book an excellent introduction to one of the most interesting topics. The book is easy to read if you know elementary molecular biology and have had a beginning introduction to Perl. I do recommend this book to start with if you are interested in programming for bioinformatics. After reading it you will be able to build simple bioinformatics programs, and you will more easily understand how the commercially available bioinformatics programs work.
Book Description
- SPSS (Statistical Package for the Social Sciences) is a data management and analysis software that allows users to generate solid, decision-making results by performing statistical analysis
- This book provides just the information needed: installing the software, entering data, setting up calculations, and analyzing data
- Covers computing cross tabulation, frequencies, descriptive ratios, means, bivariate and partial correlations, linear regression, and much more
- Explains how to output information into striking charts and graphs
- For ambitious users, also covers how to program SPSS to take their statistical analysis to the next level
Book Description
Extracting content from text continues to be an important research problem for information processing and management. Approaches to capture the semantics of text-based document collections may be based on Bayesian models, probability theory, vector space models, statistical models, or even graph theory.
As the volume of digitized textual media continues to grow, so does the need for designing robust, scalable indexing and search strategies (software) to meet a variety of user needs. Knowledge extraction or creation from text requires systematic yet reliable processing that can be codified and adapted for changing needs and environments.
This book will draw upon experts in both academia and industry to recommend practical approaches to the purification, indexing, and mining of textual information. It will address document identification, clustering and categorizing documents, cleaning text, and visualizing semantic models of text.
Customer Reviews:
subjective extraction of clusters.......2006-10-19
The book is relatively brief, given the technical nature of its chapters, each written by different authors. Many clustering methods are described. Most can be seen to have some degree of subjectivity, in defining what ends up in a given cluster. Or whether a cluster even exists or not.
The analysis of Web documents forms a major portion of the book. This data set is vast, continually changing and expanding. Plus, it is noisy. Unlike many clean data sets that might be extracted from a corpus of books, for example. Attention should be paid to methods of automatically extracting information from the Web.
The book does not go much into the higher level problems of defining ontologies. Which are very hard tasks. The closest it seems to get is along the lines of finding similar words in documents. Which is still very useful.
Book Description
It is now possible to predict the future when it comes to crime. In Data Mining and Predictive Analysis, Dr. Colleen McCue describes not only the possibilities for data mining to assist law enforcement professionals, but also provides real-world examples showing how data mining has identified crime trends, anticipated community hot-spots, and refined resource deployment decisions. In this book Dr. McCue describes her use of "off the shelf" software to graphically depict crime trends and to predict where future crimes are likely to occur. Armed with this data, law enforcement executives can develop "risk-based deployment strategies," that allow them to make informed and cost-efficient staffing decisions based on the likelihood of specific criminal activity.
Knowledge of advanced statistics is not a prerequisite for using Data Mining and Predictive Analysis. The book is a starting point for those thinking about using data mining in a law enforcement setting. It provides terminology, concepts, practical application of these concepts, and examples to highlight specific techniques and approaches in crime and intelligence analysis, which law enforcement and intelligence professionals can tailor to their own unique situation and responsibilities.
* Serves as a valuable reference tool for both the student and the law enforcement professional
* Contains practical information used in real-life law enforcement situations
* Approach is very user-friendly, conveying sophisticated analyses in practical terms
Customer Reviews:
Good introductory survey to various technologies.......2007-09-25
The book serves its purpose of providing an introduction to the various technologies that make up data mining. There are three main topic sections. The first gives an overview of the technologies involved such as fuzzy logic, Bayesian probability, and neural networks. The second topic area is more concentrated and focuses on how data mining works. This involves utilizing clustering, association, and classification of data. The final section covers advanced topics in web, spatial, and temporal mining. The only complaint that I would have is that most of the coverage, at least in section one, is cursory and one needs other reference books for serious work in the field. A very strong feature of the book is that pseudocode algorithms are offered in many sections.
clarity in exposition.......2006-08-23
Dunham gives a clear explanation of the main ideas in data mining. It's a concise book, directed towards the researcher or programmer. Space considerations meant that some topics are only briefly but succinctly covered, like fuzzy logic.
More details are provided about neural networks, genetic algorithms and similarity measures. Bayesian classifications also get a good mention. Other classification measures involve distance-based methods to define clusters. For clustering, you should note that exactly what goes into a given cluster can be rather subjective. It could depend on your choice of metric.
There is a fair amount of maths. Accessible to someone with a couple of years of university level maths, especially involving linear algebra.
The section on Web mining is especially interesting. The Web is probably the largest database in the world. Certainly the most accessible. But with different characteristics from many other databases. Web data might be wrong, deliberately or otherwise. And some websites might be link farms, that try to pump up page rankings. Other databases simply don't have this concern about their contents. Dunham explains Google's PageRank and a competing idea from IBM.
The algorithms are given in pseudocode. Which should not be a problem to an experienced programmer. Translating these into your choice of language is (or at least it should be) a lesser conceptual task than understanding the methods themselves. Or devising new methods. The book also aids the latter. Dunham's descriptions of the overall logic behind each algorithm are a good lead-in to what is needed in constructing new ones.
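As an illustration of what translating such an algorithm into a language of your choice can look like, here is a minimal sketch of PageRank computed by repeated iteration. It is not taken from Dunham's book: the damping factor of 0.85, the fixed iteration count, the handling of pages without outgoing links, and the tiny example graph are all assumptions made for the sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to. Every link
    target is assumed to also appear as a key (an assumption of this sketch).
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform distribution

    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if outlinks:
                # Split this page's damped rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # hypothetical graph
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 3))
```

In the toy graph, page "c" is linked from both "a" and "b", so it ends up ranked above "b", which is linked only from "a".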
Good book for those interested in Data Mining, Machine Learn.......2004-11-03
Currently I am taking a Machine Learning course. This book is really helpful and intuitive. My friends who are studying Bioinformatics also found it useful.
All algorithms are presented in pseudo-code.
Book Description
Learn how to develop models for classification, prediction, and customer segmentation with the help of Data Mining for Business Intelligence
In today's world, businesses are becoming more capable of accessing their ideal consumers, and an understanding of data mining contributes to this success. Data Mining for Business Intelligence, which was developed from a course taught at the Massachusetts Institute of Technology's Sloan School of Management and the University of Maryland's Smith School of Business, uses real data and actual cases to illustrate the applicability of data mining intelligence to the development of successful business models.
Featuring XLMiner, the Microsoft Office Excel add-in, this book allows readers to follow along and implement algorithms at their own speed, with a minimal learning curve. In addition, students and practitioners of data mining techniques are presented with hands-on, business-oriented applications. Abundant exercises and examples are provided to motivate learning and understanding.
Data Mining for Business Intelligence:
* Provides both a theoretical and practical understanding of the key methods of classification, prediction, reduction, exploration, and affinity analysis
* Features a business decision-making context for these key methods
* Illustrates the application and interpretation of these methods using real business cases and data
This book helps readers understand the beneficial relationship that can be established between data mining and smart business practices, and is an excellent learning tool for creating valuable strategies and making wiser business decisions.
Customer Reviews:
An Excellent Introduction, Works with Excel.......2007-03-19
Data mining is the extraction of useful information from large amounts of data. Perhaps the best example of this is Amazon. If you go to Amazon to look at a book, you'll find such tidbits of information as a section on the page headlined 'Customers who bought this item also bought' and another 'What do customers ultimately buy after viewing this item?'
That's data mining: dozens, hundreds, or thousands of people looked at the page about this item, then went on to take these other actions. From all the data that Amazon has collected, they mine their database and pull out information to fill in these blocks.
This book, intended for MBA-level students, gives an excellent introduction to data mining. It further includes access to an Excel add-in called XLMiner that is specifically set up to allow the student to use Excel to learn how data mining is done.
The one thing I would ask the authors to do in their next edition is to provide a brief review of the commercially available data mining software products. If not all of the software, perhaps just the top half dozen or so. In real life we aren't going to use Excel for data mining; our data resides in a database somewhere.
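The "Customers who bought this item also bought" idea described in this review can be sketched as simple co-occurrence counting. This is only a toy illustration under stated assumptions: the order data is invented, the function names are hypothetical, and nothing here reflects how Amazon or XLMiner actually compute such lists.

```python
from collections import Counter
from itertools import combinations


def also_bought(orders):
    """For each item, count how often every other item shares an order with it."""
    co_counts = {}
    for order in orders:
        # De-duplicate within an order so each pair counts once per order.
        for a, b in combinations(sorted(set(order)), 2):
            co_counts.setdefault(a, Counter())[b] += 1
            co_counts.setdefault(b, Counter())[a] += 1
    return co_counts


def recommend(co_counts, item, top_n=3):
    """Return up to top_n items most often bought together with `item`."""
    return [other for other, _ in co_counts.get(item, Counter()).most_common(top_n)]


if __name__ == "__main__":
    # Hypothetical purchase histories, one list per order.
    orders = [
        ["book", "lamp"],
        ["book", "lamp", "pen"],
        ["lamp", "mug"],
    ]
    print(recommend(also_bought(orders), "book"))
```

Real systems work from a database rather than an in-memory list, which is the reviewer's point about Excel, but the counting logic is the same in spirit.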
Condensed Discussion of DataMining.......2007-02-10
This book discusses some of the techniques used in Data Mining. It goes into Data Exploration as well as Evaluating Classification and Predictive Performance. Some of the more advanced techniques such as Neural Nets and Cluster Analysis are also discussed.
From the authors:.......2007-01-27
This book got its start as notes for a data mining class that one of us (Nitin Patel) was teaching at MIT, and was completed while another of us (Galit Shmueli) was teaching a similar course at Maryland. Both courses were part of an MBA program. We found that, while there are a lot of books on data mining, there were none that actually gave business students the skills and tools to implement data mining algorithms. So we set ourselves the task of writing a book that (1) provides real data sets with a business decision-making context and a hands-on orientation, (2) provides a theoretical and practical understanding of the key data mining methods of classification, prediction, data reduction and exploration at a level that is appropriate and useful for MBAs, and (3) bundles a powerful version of a commercial data mining tool that works in Excel (XLMiner). For this reason, we think our book will be appropriate not just for students, but also for business analysts with a quantitative orientation, or, indeed, anyone who wants to learn data mining via self-study. Have we succeeded? You be the judge! - P. Bruce (for G. Shmueli and N. Patel)