regex: improve error handling in RegexRepeatMatcher
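Parse repetition bounds with std::stoul instead of std::atoi, so that
out-of-range bounds raise an error, and throw on unrecognized
quantifiers instead of silently ignoring them.
Also expand unit test coverage and clean up the documentation.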
Change-Id: Ib679f869de352ef36b7ffa7a2db8381eb503e962
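Reviewer note: the stricter bound parsing in RegexRepeatMatcher::parseRepetition() can be illustrated outside the library. The following is a minimal standalone sketch under stated assumptions, not the code in this change: `parseBounds` and `QuantifierError` are hypothetical names (the latter standing in for `RegexMatcher::Error`), and a single regular expression is used here in place of the four separate `std::regex_match` calls in the patch.

```cpp
#include <cstddef>
#include <limits>
#include <regex>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical error type standing in for RegexMatcher::Error.
struct QuantifierError : std::runtime_error
{
  using std::runtime_error::runtime_error;
};

// Hypothetical helper, not part of ndn-cxx: parses "{n}", "{n,}", "{,n}",
// or "{m,n}" into a (min, max) pair and throws QuantifierError otherwise.
inline std::pair<std::size_t, std::size_t>
parseBounds(const std::string& quantifier)
{
  constexpr std::size_t unbounded = std::numeric_limits<std::size_t>::max();
  // Group 1: minimum digits; group 2: optional ",max" part; group 3: maximum digits.
  static const std::regex pattern(R"(\{([0-9]*)(,([0-9]*))?\})");

  std::smatch m;
  if (!std::regex_match(quantifier, m, pattern) ||
      (m[1].length() == 0 && m[3].length() == 0)) {
    // Covers malformed input as well as "{}" and "{,}".
    throw QuantifierError("Invalid quantifier: " + quantifier);
  }

  try {
    std::size_t min = m[1].length() > 0 ? std::stoul(m[1].str()) : 0;
    std::size_t max = min; // "{n}" means exactly n times
    if (m[2].matched) {
      max = m[3].length() > 0 ? std::stoul(m[3].str()) : unbounded;
    }
    if (min > max) {
      throw QuantifierError("Invalid number of repetitions: " + quantifier);
    }
    return {min, max};
  }
  // std::stoul throws std::out_of_range (derived from std::logic_error) on overflow.
  catch (const std::logic_error&) {
    throw QuantifierError("Invalid number of repetitions: " + quantifier);
  }
}
```

In this sketch, `parseBounds("{2,4}")` yields the pair (2, 4), while `{10,5}`, `{,}`, and `{99999999999999999999,}` are all rejected, which mirrors the invalid-quantifier cases added to the RepeatMatcher unit test.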
diff --git a/docs/tutorials/utils-ndn-regex.rst b/docs/tutorials/utils-ndn-regex.rst
index 52026ae..11ccf8a 100644
--- a/docs/tutorials/utils-ndn-regex.rst
+++ b/docs/tutorials/utils-ndn-regex.rst
@@ -4,12 +4,12 @@
NDN regular expression is a kind of regular expression that can match NDN names. Matching is
performed at two levels: the name level and the name component level.
-A name component matcher, enclosed in ``<`` and ``>``, specifies the pattern of a name component. The
-component pattern is expressed using the `Modified ECMAScript Regular Expression Syntax
+A name component matcher, enclosed in ``<`` and ``>``, specifies the pattern of a name component.
+The component pattern is expressed using the `Modified ECMAScript Regular Expression Syntax
<https://en.cppreference.com/w/cpp/regex/ecmascript>`_.
For example, ``<ab*c>`` matches the 1st, 3rd, and 4th components of ``/ac/dc/abc/abbc``, but does
-not match the 2nd component. A special case is that ``<>`` denotes a wildcard matcher that can match
-**ANY** name component.
+not match the 2nd component. A special case is that ``<>`` denotes a wildcard matcher that can
+match **ANY** name component.
A component matcher can match only one name component. To match a name, you need to compose an NDN
regular expression with zero or more name component matchers. For example, ``<ndn><edu><ucla>``
@@ -31,8 +31,8 @@
Repetition
~~~~~~~~~~
-A component matcher can be followed by a repeat quantifier to indicate how many times the preceding
-component may appear.
+A component matcher can be followed by a **repeat quantifier** to indicate how many times the
+preceding component may appear.
The ``*`` quantifier denotes "zero or more times". For example, ``^<A><B>*<C>$`` matches ``/A/C``,
``/A/B/C``, ``/A/B/B/C``, and so on.
@@ -43,44 +43,44 @@
The ``?`` quantifier denotes "zero or one time". For example, ``^<A><B>?<C>`` matches ``/A/C`` and
``/A/B/C``, but does not match ``/A/B/B/C``.
-A bounded quantifier specifies a minimum and maximum number of permitted matches: ``{n}`` denotes
-"exactly ``n`` times"; ``{n,}`` denotes "at least ``n`` times"; ``{,n}`` denotes "at most ``n``
-times"; ``{n, m}`` denotes "between ``n`` and ``m`` times (inclusive)". For example,
-``^<A><B>{2, 4}<C>$`` matches ``/A/B/B/C`` and ``/A/B/B/B/B/C``.
+A **bounded quantifier** specifies either a minimum or a maximum number of permitted matches, or
+both. ``{n}`` means "exactly ``n`` times"; ``{n,}`` means "at least ``n`` times"; ``{,n}`` means
+"at most ``n`` times"; ``{m,n}`` (with ``m ≤ n``) means "between ``m`` and ``n`` times (inclusive)".
+For example, ``^<A><B>{2,4}<C>$`` matches ``/A/B/B/C`` and ``/A/B/B/B/B/C``.
-Note that the quantifiers are **greedy**, which means it will consume as many matched components as
-possible. NDN regular expressions currently do not support non-greedy repeat matching and possessive
-repeat matching. For example, for the name ``/A/B/C/C/C``, ``^<A><B><C>+$`` will match the entire
-name instead of only ``/A/B/C``.
+Note that the quantifiers are *greedy*, meaning that they will consume as many matching components
+as possible. For example, for the name ``/A/B/C/C/C``, ``^<A><B><C>+`` will match the entire name
+instead of only ``/A/B/C``. NDN regular expressions do not currently support non-greedy or
+possessive quantifiers.
Sets
~~~~
-A name component set, denoted by a bracket expression starting with ``[`` and ending with ``]``,
+A **name component set**, denoted by a bracket expression starting with ``[`` and ending with ``]``,
defines a set of name components. It matches any single name component that is a member of that set.
-Unlike standard regular expressions, NDN regular expression only supports **Single Components Set**,
+Unlike standard regular expressions, NDN regular expressions support only single-component sets,
that is, you have to list all the set members one by one between the brackets. For example,
-``^[<ndn><localhost>]`` matches any names starting with either ``ndn"`` or ``localhost`` component.
+``^[<ndn><localhost>]`` matches any name whose first component is either ``ndn`` or ``localhost``.
-When a name component set starts with a ``'^'``, the set becomes a **Negation Set**. It matches the
+When a name component set starts with ``^``, the set becomes a **negation set**. It matches the
complement of the contained name components. For example, ``^[^<ndn>]`` matches any non-empty name
-that does not start with ``ndn`` component.
+that does *not* start with ``/ndn``.
-Some other types of sets, such as Range Set, will be supported later.
+Some other types of sets, such as range sets, will be supported later.
Note that component sets may be repeated in the same way as component matchers.
Sub-pattern and Back Reference
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-A section beginning ``(`` and ending ``)`` acts as a marked sub-pattern. Whatever matched the
+A section beginning ``(`` and ending ``)`` acts as a **marked sub-pattern**. Whatever matched the
sub-pattern is split out in a separate field by the matching algorithm. For example
``^<A>(<>{2})<B>(<>)`` matches the name ``/A/C/D/B/E``, and the first sub-pattern captures ``C/D``.
-Marked sub-patterns can be referred to by a back-reference ``\N``, which references one or more
-capturing groups. In the example above, a back reference ``\1\2`` extracts ``/C/D/E`` out of the
-name.
+Marked sub-patterns can be referred to by a **back-reference** of the form ``\N``, which references
+one or more capturing groups. In the example above, the back-reference ``\1\2`` extracts ``/C/D/E``
+out of the name.
Marked sub-patterns can also be repeated. The regex engine does not permanently substitute
back-references in a regular expression, but will use the last match saved into the back-reference.
diff --git a/ndn-cxx/util/regex/regex-backref-matcher.cpp b/ndn-cxx/util/regex/regex-backref-matcher.cpp
index 0b531ce..9725dd5 100644
--- a/ndn-cxx/util/regex/regex-backref-matcher.cpp
+++ b/ndn-cxx/util/regex/regex-backref-matcher.cpp
@@ -1,6 +1,6 @@
/* -*- Mode:C++; c-file-style:"gnu"; indent-tabs-mode:nil; -*- */
/*
- * Copyright (c) 2013-2019 Regents of the University of California.
+ * Copyright (c) 2013-2021 Regents of the University of California.
*
* This file is part of ndn-cxx library (NDN C++ library with eXperimental eXtensions).
*
@@ -35,16 +35,18 @@
void
RegexBackrefMatcher::compile()
{
- if (m_expr.size() < 2)
- NDN_THROW(Error("Unrecognized format: " + m_expr));
+ if (m_expr.size() < 2) {
+ NDN_THROW(Error("Invalid capture group syntax: " + m_expr));
+ }
size_t lastIndex = m_expr.size() - 1;
if ('(' == m_expr[0] && ')' == m_expr[lastIndex]) {
m_matchers.push_back(make_shared<RegexPatternListMatcher>(m_expr.substr(1, lastIndex - 1),
m_backrefManager));
}
- else
- NDN_THROW(Error("Unrecognized format: " + m_expr));
+ else {
+ NDN_THROW(Error("Invalid capture group syntax: " + m_expr));
+ }
}
} // namespace ndn
diff --git a/ndn-cxx/util/regex/regex-component-set-matcher.cpp b/ndn-cxx/util/regex/regex-component-set-matcher.cpp
index 6c2eca8..af186b5 100644
--- a/ndn-cxx/util/regex/regex-component-set-matcher.cpp
+++ b/ndn-cxx/util/regex/regex-component-set-matcher.cpp
@@ -1,6 +1,6 @@
/* -*- Mode:C++; c-file-style:"gnu"; indent-tabs-mode:nil; -*- */
/*
- * Copyright (c) 2013-2019 Regents of the University of California.
+ * Copyright (c) 2013-2021 Regents of the University of California.
*
* This file is part of ndn-cxx library (NDN C++ library with eXperimental eXtensions).
*
@@ -29,7 +29,6 @@
RegexComponentSetMatcher::RegexComponentSetMatcher(const std::string& expr,
shared_ptr<RegexBackrefManager> backrefManager)
: RegexMatcher(expr, EXPR_COMPONENT_SET, std::move(backrefManager))
- , m_isInclusion(true)
{
compile();
}
@@ -37,27 +36,30 @@
void
RegexComponentSetMatcher::compile()
{
- if (m_expr.size() < 2)
- NDN_THROW(Error("Regexp compile error (cannot parse " + m_expr + ")"));
+ if (m_expr.size() < 2) {
+ NDN_THROW(Error("Invalid component set syntax: " + m_expr));
+ }
switch (m_expr[0]) {
case '<':
return compileSingleComponent();
case '[': {
size_t lastIndex = m_expr.size() - 1;
- if (']' != m_expr[lastIndex])
- NDN_THROW(Error("Regexp compile error (no matching ']' in " + m_expr + ")"));
+ if (']' != m_expr[lastIndex]) {
+ NDN_THROW(Error("Missing ']' in regex: " + m_expr));
+ }
if ('^' == m_expr[1]) {
m_isInclusion = false;
compileMultipleComponents(2, lastIndex);
}
- else
+ else {
compileMultipleComponents(1, lastIndex);
+ }
break;
}
default:
- NDN_THROW(Error("Regexp compile error (cannot parse " + m_expr + ")"));
+ NDN_THROW(Error("Invalid component set syntax: " + m_expr));
}
}
@@ -66,7 +68,7 @@
{
size_t end = extractComponent(1);
if (m_expr.size() != end)
- NDN_THROW(Error("Component expr error " + m_expr));
+ NDN_THROW(Error("Component expr error: " + m_expr));
m_components.push_back(make_shared<RegexComponentMatcher>(m_expr.substr(1, end - 2), m_backrefManager));
}
@@ -79,7 +81,7 @@
while (index < lastIndex) {
if ('<' != m_expr[index])
- NDN_THROW(Error("Component expr error " + m_expr));
+ NDN_THROW(Error("Component expr error: " + m_expr));
tempIndex = index + 1;
index = extractComponent(tempIndex);
@@ -112,8 +114,8 @@
m_matchResult.push_back(name.get(offset));
return true;
}
- else
- return false;
+
+ return false;
}
size_t
@@ -124,15 +126,14 @@
while (lcount > rcount) {
switch (m_expr[index]) {
- case '<':
- lcount++;
- break;
- case '>':
- rcount++;
- break;
- case 0:
- NDN_THROW(Error("Square brackets mismatch"));
- break;
+ case '<':
+ lcount++;
+ break;
+ case '>':
+ rcount++;
+ break;
+ case 0:
+ NDN_THROW(Error("Angle brackets mismatch: " + m_expr));
}
index++;
}
diff --git a/ndn-cxx/util/regex/regex-component-set-matcher.hpp b/ndn-cxx/util/regex/regex-component-set-matcher.hpp
index 4b39afa..74370a5 100644
--- a/ndn-cxx/util/regex/regex-component-set-matcher.hpp
+++ b/ndn-cxx/util/regex/regex-component-set-matcher.hpp
@@ -61,7 +61,7 @@
private:
std::vector<shared_ptr<RegexComponentMatcher>> m_components;
- bool m_isInclusion;
+ bool m_isInclusion = true;
};
} // namespace ndn
diff --git a/ndn-cxx/util/regex/regex-repeat-matcher.cpp b/ndn-cxx/util/regex/regex-repeat-matcher.cpp
index 031edd0..e985016 100644
--- a/ndn-cxx/util/regex/regex-repeat-matcher.cpp
+++ b/ndn-cxx/util/regex/regex-repeat-matcher.cpp
@@ -25,7 +25,6 @@
#include "ndn-cxx/util/regex/regex-backref-matcher.hpp"
#include "ndn-cxx/util/regex/regex-component-set-matcher.hpp"
-#include <cstdlib>
#include <regex>
namespace ndn {
@@ -43,7 +42,8 @@
RegexRepeatMatcher::compile()
{
if ('(' == m_expr[0]) {
- auto matcher = make_shared<RegexBackrefMatcher>(m_expr.substr(0, m_indicator), m_backrefManager);
+ auto matcher = make_shared<RegexBackrefMatcher>(m_expr.substr(0, m_indicator),
+ m_backrefManager);
m_backrefManager->pushRef(matcher);
matcher->compile();
m_matchers.push_back(std::move(matcher));
@@ -56,75 +56,70 @@
parseRepetition();
}
-bool
+void
RegexRepeatMatcher::parseRepetition()
{
+ constexpr size_t MAX_REPETITIONS = std::numeric_limits<size_t>::max();
size_t exprSize = m_expr.size();
- const size_t MAX_REPETITIONS = std::numeric_limits<size_t>::max();
if (exprSize == m_indicator) {
m_repeatMin = 1;
m_repeatMax = 1;
- return true;
}
-
- if (exprSize == (m_indicator + 1)) {
- if ('?' == m_expr[m_indicator]) {
+ else if (exprSize == m_indicator + 1) {
+ switch (m_expr[m_indicator]) {
+ case '?':
m_repeatMin = 0;
m_repeatMax = 1;
- return true;
- }
- if ('+' == m_expr[m_indicator]) {
+ break;
+ case '+':
m_repeatMin = 1;
m_repeatMax = MAX_REPETITIONS;
- return true;
- }
- if ('*' == m_expr[m_indicator]) {
+ break;
+ case '*':
m_repeatMin = 0;
m_repeatMax = MAX_REPETITIONS;
- return true;
+ break;
+ default:
+ NDN_THROW(Error("Unrecognized quantifier '"s + m_expr[m_indicator] + "' in regex: " + m_expr));
}
}
else {
std::string repeatStruct = m_expr.substr(m_indicator, exprSize - m_indicator);
size_t rsSize = repeatStruct.size();
- size_t min = 0;
- size_t max = 0;
- if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+,[0-9]+\\}"))) {
- size_t separator = repeatStruct.find_first_of(',', 0);
- min = std::atoi(repeatStruct.substr(1, separator - 1).data());
- max = std::atoi(repeatStruct.substr(separator + 1, rsSize - separator - 2).data());
+ try {
+ if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+,[0-9]+\\}"))) {
+ size_t separator = repeatStruct.find_first_of(',', 0);
+ m_repeatMin = std::stoul(repeatStruct.substr(1, separator - 1));
+ m_repeatMax = std::stoul(repeatStruct.substr(separator + 1, rsSize - separator - 2));
+ if (m_repeatMin > m_repeatMax) {
+ NDN_THROW(Error("Invalid number of repetitions '" + repeatStruct + "' in regex: " + m_expr));
+ }
+ }
+ else if (std::regex_match(repeatStruct, std::regex("\\{,[0-9]+\\}"))) {
+ size_t separator = repeatStruct.find_first_of(',', 0);
+ m_repeatMin = 0;
+ m_repeatMax = std::stoul(repeatStruct.substr(separator + 1, rsSize - separator - 2));
+ }
+ else if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+,\\}"))) {
+ size_t separator = repeatStruct.find_first_of(',', 0);
+ m_repeatMin = std::stoul(repeatStruct.substr(1, separator - 1));
+ m_repeatMax = MAX_REPETITIONS;
+ }
+ else if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+\\}"))) {
+ m_repeatMin = std::stoul(repeatStruct.substr(1, rsSize - 2));
+ m_repeatMax = m_repeatMin;
+ }
+ else {
+ NDN_THROW(Error("Invalid quantifier '" + repeatStruct + "' in regex: " + m_expr));
+ }
}
- else if (std::regex_match(repeatStruct, std::regex("\\{,[0-9]+\\}"))) {
- size_t separator = repeatStruct.find_first_of(',', 0);
- min = 0;
- max = std::atoi(repeatStruct.substr(separator + 1, rsSize - separator - 2).data());
+ // std::stoul can throw invalid_argument or out_of_range, both of which derive from logic_error
+ catch (const std::logic_error&) {
+ NDN_THROW_NESTED(Error("Invalid number of repetitions '" + repeatStruct + "' in regex: " + m_expr));
}
- else if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+,\\}"))) {
- size_t separator = repeatStruct.find_first_of(',', 0);
- min = std::atoi(repeatStruct.substr(1, separator).data());
- max = MAX_REPETITIONS;
- }
- else if (std::regex_match(repeatStruct, std::regex("\\{[0-9]+\\}"))) {
- min = std::atoi(repeatStruct.substr(1, rsSize - 1).data());
- max = min;
- }
- else {
- NDN_THROW(Error("parseRepetition: unrecognized format " + m_expr));
- }
-
- if (min > MAX_REPETITIONS || max > MAX_REPETITIONS || min > max) {
- NDN_THROW(Error("parseRepetition: wrong number " + m_expr));
- }
-
- m_repeatMin = min;
- m_repeatMax = max;
-
- return true;
}
-
- return false;
}
bool
@@ -132,13 +127,14 @@
{
m_matchResult.clear();
- if (m_repeatMin == 0)
- if (len == 0)
- return true;
+ if (m_repeatMin == 0 && len == 0) {
+ return true;
+ }
if (recursiveMatch(0, name, offset, len)) {
- for (size_t i = offset; i < offset + len; i++)
+ for (size_t i = offset; i < offset + len; i++) {
m_matchResult.push_back(name.get(i));
+ }
return true;
}
@@ -148,8 +144,6 @@
bool
RegexRepeatMatcher::recursiveMatch(size_t repeat, const Name& name, size_t offset, size_t len)
{
- ssize_t tried = len;
-
if (0 < len && repeat >= m_repeatMax) {
return false;
}
@@ -162,11 +156,13 @@
return true;
}
- auto matcher = m_matchers[0];
+ const auto& matcher = m_matchers[0];
+ ssize_t tried = static_cast<ssize_t>(len);
while (tried >= 0) {
if (matcher->match(name, offset, tried) &&
- recursiveMatch(repeat + 1, name, offset + tried, len - tried))
+ recursiveMatch(repeat + 1, name, offset + tried, len - tried)) {
return true;
+ }
tried--;
}
diff --git a/ndn-cxx/util/regex/regex-repeat-matcher.hpp b/ndn-cxx/util/regex/regex-repeat-matcher.hpp
index 6ba336a..547905d 100644
--- a/ndn-cxx/util/regex/regex-repeat-matcher.hpp
+++ b/ndn-cxx/util/regex/regex-repeat-matcher.hpp
@@ -42,7 +42,7 @@
void
compile();
- bool
+ void
parseRepetition();
bool
@@ -50,8 +50,8 @@
private:
size_t m_indicator;
- size_t m_repeatMin;
- size_t m_repeatMax;
+ size_t m_repeatMin = 0;
+ size_t m_repeatMax = 0;
};
} // namespace ndn
diff --git a/tests/unit/util/regex.t.cpp b/tests/unit/util/regex.t.cpp
index b4fc02e..ca15b66 100644
--- a/tests/unit/util/regex.t.cpp
+++ b/tests/unit/util/regex.t.cpp
@@ -44,20 +44,20 @@
{
shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
shared_ptr<RegexComponentMatcher> cm = make_shared<RegexComponentMatcher>("a", backRef);
- bool res = cm->match(Name("/a/b/"), 0, 1);
+ bool res = cm->match(Name("/a/b"), 0, 1);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 1);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexComponentMatcher>("a", backRef);
- res = cm->match(Name("/a/b/"), 1, 1);
+ res = cm->match(Name("/a/b"), 1, 1);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexComponentMatcher>("(c+)\\.(cd)", backRef);
- res = cm->match(Name("/ccc.cd/b/"), 0, 1);
+ res = cm->match(Name("/ccc.cd/b"), 0, 1);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 1);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("ccc.cd"));
@@ -69,18 +69,18 @@
BOOST_AUTO_TEST_CASE(ComponentSetMatcher)
{
- shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
- shared_ptr<RegexComponentSetMatcher> cm = make_shared<RegexComponentSetMatcher>("<a>", backRef);
- bool res = cm->match(Name("/a/b/"), 0, 1);
+ auto backRef = make_shared<RegexBackrefManager>();
+ auto cm = make_shared<RegexComponentSetMatcher>("<a>", backRef);
+ bool res = cm->match(Name("/a/b"), 0, 1);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 1);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- res = cm->match(Name("/a/b/"), 1, 1);
+ res = cm->match(Name("/a/b"), 1, 1);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
- res = cm->match(Name("/a/b/"), 0, 2);
+ res = cm->match(Name("/a/b"), 0, 2);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
@@ -101,12 +101,21 @@
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 1);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("d"));
+
+ backRef = make_shared<RegexBackrefManager>();
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("(<a><b>)", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("[<a><b>", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("<a", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("<a>b", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("<><<>", backRef), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexComponentSetMatcher>("[abc]", backRef), RegexMatcher::Error);
}
BOOST_AUTO_TEST_CASE(RepeatMatcher)
{
- shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
- shared_ptr<RegexRepeatMatcher> cm = make_shared<RegexRepeatMatcher>("[<a><b>]*", backRef, 8);
+ auto backRef = make_shared<RegexBackrefManager>();
+ auto cm = make_shared<RegexRepeatMatcher>("[<a><b>]*", backRef, 8);
bool res = cm->match(Name("/a/b/c"), 0, 0);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
@@ -131,7 +140,7 @@
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("<.*>*", backRef, 4);
- res = cm->match(Name("/a/b/c/d/e/f/"), 0, 6);
+ res = cm->match(Name("/a/b/c/d/e/f"), 0, 6);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 6);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
@@ -143,7 +152,7 @@
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("<>*", backRef, 2);
- res = cm->match(Name("/a/b/c/d/e/f/"), 0, 6);
+ res = cm->match(Name("/a/b/c/d/e/f"), 0, 6);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 6);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
@@ -172,53 +181,53 @@
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("[<a><b>]{3}", backRef, 8);
- res = cm->match(Name("/a/b/a/d/"), 0, 2);
+ res = cm->match(Name("/a/b/a/d"), 0, 2);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
- res = cm->match(Name("/a/b/a/d/"), 0, 3);
+ res = cm->match(Name("/a/b/a/d"), 0, 3);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 3);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[2].toUri(), string("a"));
- res = cm->match(Name("/a/b/a/d/"), 0, 4);
+ res = cm->match(Name("/a/b/a/d"), 0, 4);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("[<a><b>]{2,3}", backRef, 8);
- res = cm->match(Name("/a/b/a/d/e/"), 0, 2);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
- res = cm->match(Name("/a/b/a/d/e/"), 0, 3);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 3);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 3);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[2].toUri(), string("a"));
- res = cm->match(Name("/a/b/a/b/e/"), 0, 4);
+ res = cm->match(Name("/a/b/a/b/e"), 0, 4);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
- res = cm->match(Name("/a/b/a/d/e/"), 0, 1);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 1);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("[<a><b>]{2,}", backRef, 8);
- res = cm->match(Name("/a/b/a/d/e/"), 0, 2);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
- res = cm->match(Name("/a/b/a/b/e/"), 0, 4);
+ res = cm->match(Name("/a/b/a/b/e"), 0, 4);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 4);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
@@ -226,43 +235,58 @@
BOOST_CHECK_EQUAL(cm->getMatchResult()[2].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[3].toUri(), string("b"));
- res = cm->match(Name("/a/b/a/d/e/"), 0, 1);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 1);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
backRef = make_shared<RegexBackrefManager>();
cm = make_shared<RegexRepeatMatcher>("[<a><b>]{,2}", backRef, 8);
- res = cm->match(Name("/a/b/a/b/e/"), 0, 3);
+ res = cm->match(Name("/a/b/a/b/e"), 0, 3);
BOOST_CHECK_EQUAL(res, false);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
- res = cm->match(Name("/a/b/a/b/e/"), 0, 2);
+ res = cm->match(Name("/a/b/a/b/e"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
- res = cm->match(Name("/a/b/a/d/e/"), 0, 1);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 1);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 1);
BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- res = cm->match(Name("/a/b/a/d/e/"), 0, 0);
+ res = cm->match(Name("/a/b/a/d/e"), 0, 0);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 0);
+
+ backRef = make_shared<RegexBackrefManager>();
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>!", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>@", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>##", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{,}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>1,2", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{foo,bar}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{0x12,0x34}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{10,5}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{99999999999999999999,}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{,99999999999999999999}", backRef, 2), RegexMatcher::Error);
+ BOOST_CHECK_THROW(make_shared<RegexRepeatMatcher>("<>{1,2,3}", backRef, 2), RegexMatcher::Error);
}
-BOOST_AUTO_TEST_CASE(BackRefMatcher)
+BOOST_AUTO_TEST_CASE(BackrefMatcher)
{
- shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
- shared_ptr<RegexBackrefMatcher> cm = make_shared<RegexBackrefMatcher>("(<a><b>)", backRef);
+ auto backRef = make_shared<RegexBackrefManager>();
+ auto cm = make_shared<RegexBackrefMatcher>("(<a><b>)", backRef);
backRef->pushRef(static_pointer_cast<RegexMatcher>(cm));
cm->compile();
bool res = cm->match(Name("/a/b/c"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
- BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), "b");
BOOST_CHECK_EQUAL(backRef->size(), 1);
backRef = make_shared<RegexBackrefManager>();
@@ -272,41 +296,48 @@
res = cm->match(Name("/a/b/c"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
- BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), "b");
BOOST_CHECK_EQUAL(backRef->size(), 2);
- BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[1].toUri(), string("b"));
- BOOST_CHECK_EQUAL(backRef->getBackref(1)->getMatchResult()[0].toUri(), string("b"));
+ BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[1].toUri(), "b");
+ BOOST_CHECK_EQUAL(backRef->getBackref(1)->getMatchResult()[0].toUri(), "b");
+
+ backRef = make_shared<RegexBackrefManager>();
+ cm = make_shared<RegexBackrefMatcher>("", backRef);
+ BOOST_CHECK_THROW(cm->compile(), RegexMatcher::Error);
+ cm = make_shared<RegexBackrefMatcher>("(", backRef);
+ BOOST_CHECK_THROW(cm->compile(), RegexMatcher::Error);
+ cm = make_shared<RegexBackrefMatcher>("(<a><b>", backRef);
+ BOOST_CHECK_THROW(cm->compile(), RegexMatcher::Error);
+ cm = make_shared<RegexBackrefMatcher>("[<a><b>)", backRef);
+ BOOST_CHECK_THROW(cm->compile(), RegexMatcher::Error);
}
-BOOST_AUTO_TEST_CASE(BackRefMatcherAdvanced)
+BOOST_AUTO_TEST_CASE(BackrefMatcherAdvanced)
{
- shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
- shared_ptr<RegexRepeatMatcher> cm = make_shared<RegexRepeatMatcher>("([<a><b>])+", backRef, 10);
+ auto backRef = make_shared<RegexBackrefManager>();
+ shared_ptr<RegexMatcher> cm = make_shared<RegexRepeatMatcher>("([<a><b>])+", backRef, 10);
bool res = cm->match(Name("/a/b/c"), 0, 2);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 2);
- BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), "b");
BOOST_CHECK_EQUAL(backRef->size(), 1);
- BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), string("b"));
-}
+ BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), "b");
-BOOST_AUTO_TEST_CASE(BackRefMatcherAdvanced2)
-{
- shared_ptr<RegexBackrefManager> backRef = make_shared<RegexBackrefManager>();
- shared_ptr<RegexPatternListMatcher> cm = make_shared<RegexPatternListMatcher>("(<a>(<b>))<c>", backRef);
- bool res = cm->match(Name("/a/b/c"), 0, 3);
+ backRef = make_shared<RegexBackrefManager>();
+ cm = make_shared<RegexPatternListMatcher>("(<a>(<b>))<c>", backRef);
+ res = cm->match(Name("/a/b/c"), 0, 3);
BOOST_CHECK_EQUAL(res, true);
BOOST_CHECK_EQUAL(cm->getMatchResult().size(), 3);
- BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), string("b"));
- BOOST_CHECK_EQUAL(cm->getMatchResult()[2].toUri(), string("c"));
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[1].toUri(), "b");
+ BOOST_CHECK_EQUAL(cm->getMatchResult()[2].toUri(), "c");
BOOST_CHECK_EQUAL(backRef->size(), 2);
- BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), string("a"));
- BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[1].toUri(), string("b"));
- BOOST_CHECK_EQUAL(backRef->getBackref(1)->getMatchResult()[0].toUri(), string("b"));
+ BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[0].toUri(), "a");
+ BOOST_CHECK_EQUAL(backRef->getBackref(0)->getMatchResult()[1].toUri(), "b");
+ BOOST_CHECK_EQUAL(backRef->getBackref(1)->getMatchResult()[0].toUri(), "b");
}
BOOST_AUTO_TEST_CASE(PatternListMatcher)