I’m going to break it down, step by step, and take you through the concepts of regular expressions in Ruby and on to advanced techniques. My hope is that you’ll see the beauty of regular expressions and be able to move beyond the fear and intimidation and embrace them in your code.
A regular expression is just a pattern. It’s a pattern that a string either fits or it doesn’t. Programming Ruby 1.9 by Dave Thomas (more commonly known as the pickaxe book) sums up what you do with regular expressions in three words--test, extract, change. You can test a string to see if it matches your pattern. You can extract a string (or part of a string) that matches your pattern. And you can change a string, replacing parts that match your pattern with other text. Test, extract, change. So simple, yet so powerful.
Regular Expressions in Ruby
Ruby lets us take regular expressions to further heights. In Ruby, everything is an object. This includes regular expressions. Since you can send messages to objects, you can also send messages to regular expressions. You can also assign them to variables, pass them to methods, and more.
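As a small illustration of treating a regular expression as an ordinary object, the sketch below stores one in a variable, asks it about itself, and passes it to a method. The names `pattern` and `mentions?` are made up for this example and are not from the original article.

```ruby
# A regular expression literal is an instance of the Regexp class.
pattern = /force/
p pattern.class    # Regexp
p pattern.source   # "force"

# Hypothetical helper: accepts any Regexp as an argument.
def mentions?(text, pattern)
  # Regexp#match returns nil when the pattern is not found.
  !pattern.match(text).nil?
end

p mentions?("Use the force", pattern)   # true
p mentions?("Use the fork", pattern)    # false
```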
As of Ruby 1.9, Ruby uses the Oniguruma regular expression library. It provides all the standard features of other regular expression libraries, and adds further features and extensions. It handles complex, multi-byte characters (such as Japanese text) very well. A feature I particularly like is that I can use \h as shorthand for a hexadecimal digit and \H for any character that is not one.
I recently found out that Ruby 2.0 uses a different regular expression library, Onigmo. Onigmo is a fork of Oniguruma and will add even more features that can be harnessed by Ruby. It will be quite interesting to see what it brings. Regardless of what changes Onigmo makes, it won’t change the fundamental thing you do with a regular expression. You compose a pattern and match a string against it.
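To make the hexadecimal shorthand concrete, here is a short sketch. The sample strings are invented for illustration, and the anchors \A and \z (start and end of string) are used only so the whole string has to consist of hex digits.

```ruby
# \h matches one hexadecimal digit (0-9, a-f, A-F).
p("c0ffee" =~ /\A\h+\z/)    # 0   (every character is a hex digit)
p("coffee" =~ /\A\h+\z/)    # nil ("o" is not a hex digit)

# \H is the opposite: any character that is not a hex digit.
p("c0ffee!".scan(/\H/))     # ["!"]
```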
In most of my Ruby regular expressions, I use the =~ operator. This is Ruby’s basic pattern matching operator. When I use this operator, I’m asking Ruby "Does this string contain this pattern?". Let’s go through an example:
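Reconstructed from the description that follows, a minimal version of this example might look like the sketch below. The variable name `result` is an assumption added so the outcome can be printed.

```ruby
# Pattern on the left, string on the right:
# "does this string contain the pattern /force/?"
result = /force/ =~ "Use the force"

# =~ returns nil when nothing matches, so any other value means "found".
puts result.nil? ? "no match" : "match"   # match
```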
On the left side I have a regular expression, which in this case is the literal word force. On the other side I have a quote from one of my very favorite movies, Star Wars. "Use the force." When I run this, I’m asking Ruby if my pattern, on the left, exists in the string on the right. What’s nice is I can format it the opposite way if I want to.
I can have the string on the left and the regular expression on the right. I’m asking the same question, just in a different order: does my string contain my regular expression? Some people find this a little more readable. When I run this, it returns the position where my match starts. In this case, my match for the pattern /force/ begins at index 8, which is the ninth character of my string because Ruby counts positions from zero.
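A short sketch of the reversed form and its return value follows. The variable names `quote` and `position` are assumptions made for illustration.

```ruby
quote = "Use the force"

# String on the left, pattern on the right.
position = quote =~ /force/
puts position                         # 8 (zero-based index of the "f")

# Both orderings give the same answer.
puts position == (/force/ =~ quote)   # true

# With no match, =~ returns nil rather than a number.
p(quote =~ /dark side/)               # nil
```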
I can also test if a string DOES NOT match a pattern by using the operator !~. This just returns a Boolean of true or false.
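For instance, with the same quote (the variable name `quote` is an assumption for this sketch):

```ruby
quote = "Use the force"

# !~ answers "does this string NOT contain the pattern?"
p(quote !~ /force/)       # false: the string does contain "force"
p(quote !~ /dark side/)   # true: the pattern is not found
```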
Operators are great for basic matching, knowing whether my string matches my regular expression or does not. But Ruby provides me much more information about my match. All I have to do is ask.
The secret is to make my match a MatchData object. I create this object using the match method. Let’s say I have a string.
And I want to find out if this string contains the word force. I call the match method on my regular expression and pass it my string. When I run this, it returns an instance of the MatchData class for my particular match:
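A minimal sketch of this step is shown below. The string is an assumption, reconstructed from the capture groups quoted later in this article, and the variable name `my_string` is hypothetical.

```ruby
# Assumed sample string, pieced together from the captures shown later.
my_string = "The force will be with you always"

# Call match on the regular expression and pass it the string.
my_match = /force/.match(my_string)

p my_match         # #<MatchData "force">
p my_match.class   # MatchData
```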
As of Ruby 1.9, I actually don’t have to start my match at the beginning of my string. I can pass a second argument, an integer, which tells Ruby the zero-based position in the string at which to start looking for a match.
In this case, it returns nil. In order to find a match for the word force, it needs to start matching earlier in the string. For this example, however, I’m just going to pass my string the first way. I have access to methods which give me MUCH more information about my match because my match is an instance of the MatchData class.
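The effect of the starting position can be sketched as follows. The sample string and the positions 5 and 4 are assumptions chosen for illustration, not values from the original listing.

```ruby
my_string = "The force will be with you always"

# "force" begins at index 4, so a search starting at index 5 is too late.
p(/force/.match(my_string, 5))   # nil

# Starting at index 4 (or earlier) still finds it.
p(/force/.match(my_string, 4))   # #<MatchData "force">
```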
If I call to_s on my match, it will return the entire text of my match. If I call pre_match on my match, it will return the portion of my string that comes BEFORE my match:
If I call post_match on my match, it will return the portion of my string that comes AFTER my match:
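Put together, these three methods can be sketched like this, again assuming the sample string reconstructed from the captures quoted later in the article:

```ruby
my_match = /force/.match("The force will be with you always")

p my_match.to_s         # "force"                      (the matched text)
p my_match.pre_match    # "The "                       (everything before it)
p my_match.post_match   # " will be with you always"   (everything after it)
```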
All these methods (and there are several more) are quite useful. However, MatchData truly shines when it comes to capture groups. Capture groups are subexpressions within your regular expression. Let’s look at an example.
In order for a string to match this regular expression, it has to have any character appearing any number of times (that’s what the .* means), followed by the word force, followed by any character appearing any number of times. Notice that I have the first and last part of this expression in parentheses. These are called groups. When I run this match on my string, it will store the matches for my groups in memory. I can access these groups and use them in other places in my code.
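A sketch of the pattern described above, applied to the assumed sample string. The `captures` method is not mentioned in the article; it is used here only to list the stored groups.

```ruby
my_string = "The force will be with you always"

# Two capture groups: whatever comes before "force" and whatever comes after.
my_match = /(.*)force(.*)/.match(my_string)

p my_match
# #<MatchData "The force will be with you always" 1:"The " 2:" will be with you always">

# captures returns just the groups, in order.
p my_match.captures   # ["The ", " will be with you always"]
```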
MatchData objects are a lot like arrays. I can access individual captures using bracket notation, similar to how I access individual elements of an array. If I run my_match[1], I get my first capture group, "The ". Similarly, my_match[2] returns my second capture group, " will be with you always".
Notice I didn’t start with my_match[0], like I would with the first element of an array. If I run my_match[0], rather than getting my first capture group, I get the entire matched text, which here is the full string I ran the match on.
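The bracket access described above can be sketched as follows, using the same assumed sample string:

```ruby
my_match = /(.*)force(.*)/.match("The force will be with you always")

p my_match[1]   # "The "                        (first capture group)
p my_match[2]   # " will be with you always"    (second capture group)

# Index 0 is the entire matched text, not the first capture group.
p my_match[0]   # "The force will be with you always"
```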
It’s important to remember that although MatchData objects are similar to arrays, they’re not actually arrays. If I try to call an array method like each on my MatchData object, I get a NoMethodError. I can easily fix this, however, by converting my MatchData to an array using the to_a method.
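A minimal sketch of both behaviours is shown below. The variables `error` and `parts` are assumptions added so the results can be inspected.

```ruby
my_match = /(.*)force(.*)/.match("The force will be with you always")

# MatchData does not define each, so this raises NoMethodError.
error = nil
begin
  my_match.each { |part| puts part }
rescue NoMethodError => e
  error = e
  puts "NoMethodError: #{e.message}"
end

# to_a gives a real array: the full match first, then each capture group.
parts = my_match.to_a
parts.each { |part| p part }
# "The force will be with you always"
# "The "
# " will be with you always"
```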
* Using Regular Expression in Ruby - Part2
* Using Regular Expression in Ruby - Part3